Object Detection & Instance Segmentation
YOLOv5, Faster R-CNN, EfficientDet D7, Mask R-CNN, Detectron2, and the TF2 Object Detection API
Hichem Felouat - Algeria 2020
hichemfel@gmail.com
https://www.researchgate.net/profile/Hichem_Felouat
https://www.linkedin.com/in/hichemfelouat/
Object Detection
• We all know the image classification problem: given an image, the
network predicts the class to which the image belongs.
• Localizing an object in a picture means predicting a bounding box around
the object; this can be expressed as a regression task.
Object Detection
Problem: the dataset does not have bounding
boxes around the objects. How can we train our
model?
• We need to add them ourselves. This is often one of the
hardest and most costly parts of a Machine Learning
project: getting the labels.
• It is a good idea to spend time looking for the right tools.
Image Labeling Tool
An image labeling or annotation tool is used to label the images for
bounding box object detection and segmentation.
Open-source image labeling tools like:
• VGG Image Annotator (VIA)
• LabelImg
• OpenLabeler
• ImgLab
Commercial tools like:
• LabelBox
• Supervisely
Crowdsourcing platforms like:
• Amazon Mechanical Turk
Image Labeling Tool - VGG Image Annotator
VGG Image Annotator (VIA): http://www.robots.ox.ac.uk/~vgg/software/via/
Image Labeling Tool - LabelImg
Object Detection - Notes
• The bounding boxes should be normalized so that the
horizontal and vertical coordinates, as well as the
height and width, all range from 0 to 1.
• It is common to predict the square root of the height
and width rather than the height and width directly: this
way, a 10-pixel error for a large bounding box will not be
penalized as much as a 10-pixel error for a small
bounding box.
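For intuition, a minimal sketch of this encoding (the helper names encode_bbox/decode_bbox and the top-left-corner convention are assumptions, not from the slides):
import numpy as np

def encode_bbox(x, y, w, h, img_w, img_h):
    """Normalize a box to [0, 1] and take the sqrt of width/height.

    (x, y) is the top-left corner in pixels; the sqrt makes the loss
    less sensitive to a fixed pixel error on large boxes."""
    return np.array([x / img_w, y / img_h,
                     np.sqrt(w / img_w), np.sqrt(h / img_h)])

def decode_bbox(t, img_w, img_h):
    """Invert encode_bbox back to pixel coordinates."""
    x, y, sw, sh = t
    return x * img_w, y * img_h, (sw ** 2) * img_w, (sh ** 2) * img_h

# Example: a 50x30 box at (100, 80) in a 416x416 image.
t = encode_bbox(100, 80, 50, 30, 416, 416)
print(t, decode_bbox(t, 416, 416))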
How to Evaluate an Object Detection Model?
• The MSE often works fairly well as a cost function to train the
model, but it is not a great metric to evaluate how well the model
can predict bounding boxes.
• The most common metric for this is the Intersection over Union
(IoU).
• tf.keras.metrics.MeanIoU
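A minimal sketch of IoU for two axis-aligned boxes (assuming the [x1, y1, x2, y2] corner format):
def iou(box_a, box_b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    # Coordinates of the intersection rectangle.
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou([0, 0, 10, 10], [5, 5, 15, 15]))  # 25 / 175 ≈ 0.14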
mean Average Precision (mAP)
In order to calculate mAP, we draw a series of precision-recall
curves with the IoU threshold set at varying levels of difficulty. In
COCO evaluation, the IoU threshold ranges from 0.5 to 0.95 with a
step size of 0.05, represented as AP@[.5:.05:.95].
Evaluate Object Detection Model - mAP
Evaluate Object Detection Model - mAP
We draw these precision-recall curves for the dataset split out by
class type (for example 3 classes and 6 threshold ranges).
Evaluate Object Detection Model - mAP
These precision and recall values are then plotted to get a PR
(precision-recall) curve. The area under the PR curve is called
Average Precision (AP).
• For each class, calculate AP at different IoU thresholds and take
their average to get the AP of that class.
Example: in the PASCAL VOC 2007 challenge, AP is
defined as the mean of the precision values at a set
of 11 equally spaced recall levels
[0, 0.1, …, 1] (0 to 1 at a step size of 0.1).
Object detection metrics https://github.com/rafaelpadilla/Object-Detection-Metrics
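A minimal sketch of the 11-point interpolated AP described above (voc_ap_11_point is a hypothetical helper; its inputs are per-detection precision/recall arrays swept over the confidence threshold):
import numpy as np

def voc_ap_11_point(recalls, precisions):
    """11-point interpolated AP (PASCAL VOC 2007 style)."""
    ap = 0.0
    for r in np.arange(0.0, 1.1, 0.1):
        # Interpolated precision: max precision at recall >= r (0 if none).
        mask = recalls >= r
        p = precisions[mask].max() if mask.any() else 0.0
        ap += p / 11.0
    return ap

# Toy example with three detections.
print(voc_ap_11_point(np.array([0.33, 0.66, 1.0]),
                      np.array([1.0, 0.66, 0.75])))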
Evaluate Object Detection Model 2 mAP
• Calculate the final AP by averaging the AP over different classes.
mAP at different IoU thresholds
mAP example on the COCO dataset
Object Detection - Example
[Figure: a raw 416*416 image is labeled with (C, X, Y, W, H); the model is
trained on these labels and outputs the predicted (C, X, Y, W, H) on the
result image.]
• Each item should be a tuple of the form:
(images, (class labels, bounding boxes))
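A minimal Keras sketch of a localization model whose two heads match that tuple structure (the backbone choice, the 10-class head, and the layer sizes are illustrative assumptions):
import tensorflow as tf

# Backbone CNN (illustrative choice).
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(416, 416, 3), include_top=False, weights=None)

inputs = tf.keras.Input(shape=(416, 416, 3))
x = backbone(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
# Two heads: class label (classification) and bounding box (regression).
class_out = tf.keras.layers.Dense(10, activation="softmax", name="class")(x)
box_out = tf.keras.layers.Dense(4, activation="sigmoid", name="box")(x)  # normalized (X, Y, W, H)

model = tf.keras.Model(inputs, [class_out, box_out])
model.compile(optimizer="adam",
              loss={"class": "sparse_categorical_crossentropy", "box": "mse"})
# model.fit(images, {"class": class_labels, "box": bounding_boxes}, ...)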
Object Detection - Example
Full code: https://www.kaggle.com/hichemfelouat/braintumorlocalization
Object Detection - Example (Transfer Learning)
Full code: https://www.kaggle.com/hichemfelouat/braintumorlocalization
Multiple Objects Detection
The task of classifying and localizing multiple objects in an
image is called object detection.
Detecting multiple objects by
sliding a CNN across the image.
Multiple Objects Detection
1. You need to add an extra objectness output to your CNN, to
estimate the probability that an object is indeed present in the
image.
2. Find the bounding box with the highest objectness score,
and get rid of all the other bounding boxes that overlap a lot with
it (IoU).
3. Repeat step two until there are no more bounding boxes to get
rid of.
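Steps 2 and 3 together are non-max suppression (NMS); a minimal single-class NumPy sketch (the [x1, y1, x2, y2] box format and the 0.5 IoU threshold are assumptions):
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS. boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,).
    Returns the indices of the boxes to keep."""
    order = np.argsort(scores)[::-1]  # highest objectness first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the best box with all remaining boxes.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                 (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter)
        # Drop boxes that overlap the kept box too much (step 2), repeat (step 3).
        order = order[1:][iou < iou_thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], float)
print(nms(boxes, np.array([0.9, 0.8, 0.7])))  # -> [0, 2]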
Multiple Objects Detection
In general, object detectors have three (3) main components:
1) The backbone that extracts features from the given image.
2) The feature network that takes multiple levels of features from
the backbone as input and outputs a list of fused features that
represent salient characteristics of the image.
3) The final class/box network that uses the fused features to
predict the class and location of each object.
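A schematic Keras sketch of these three components (the layer choices are purely illustrative assumptions, not any particular detector):
import tensorflow as tf

def build_detector(num_classes=80, num_anchors=5):
    inputs = tf.keras.Input(shape=(None, None, 3))
    # 1) Backbone: extracts feature maps from the given image.
    backbone = tf.keras.applications.ResNet50(include_top=False, weights=None)
    features = backbone(inputs)
    # 2) Feature network: a 1x1 conv standing in for an FPN/BiFPN fusion stage.
    fused = tf.keras.layers.Conv2D(256, 1, activation="relu")(features)
    # 3) Class/box heads: per-anchor class scores and box coordinates.
    cls = tf.keras.layers.Conv2D(num_anchors * num_classes, 3, padding="same")(fused)
    box = tf.keras.layers.Conv2D(num_anchors * 4, 3, padding="same")(fused)
    return tf.keras.Model(inputs, [cls, box])

model = build_detector()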
What is Yolo?
• You Only Look Once (YOLO) is an algorithm that uses
convolutional neural networks for object detection.
• It is one of the fastest object detection algorithms out
there.
• It is a very good choice when we need real-time
detection without losing too much accuracy.
You Only Look Once: Unified, Real-Time Object Detection https://arxiv.org/abs/1506.02640
YOLOV3: https://arxiv.org/abs/1804.02767
YOLOV4: https://arxiv.org/abs/2004.10934
What is Yolo?
YoloV4 architecture
What is Yolo?
Backbone
The backbone is the feature-extraction architecture. Tiny YOLO has only 9
convolutional layers, so it is less accurate but faster and better
suited for mobile and embedded projects. Darknet53 (the
backbone used in YOLOv3) has 53 convolutional layers, so it is
more accurate but slower.
In YOLOv4, the backbone can be VGG, ResNet, SpineNet,
EfficientNet, ResNeXt, or Darknet53.
What is Yolo?
Neck
The neck consists of additional layers between the backbone and the head
(the dense prediction block); its purpose is to add extra information to the
features (Feature Pyramid Networks, Path Aggregation Network, ...).
Head
The head block is the part used to locate bounding boxes and
classify what is inside each box.
Sparse Prediction
The sparse-prediction block is used in two-stage detection algorithms.
Interpreting The YOLO Output
• The input is a batch of images of shape (m, 608, 608, 3).
• The output is a list of bounding boxes along with the recognized classes.
Each bounding box is represented by 6 numbers (pc, bx, by, bh, bw, c). If
we expand c into an 80-dimensional vector (the number of classes), each
bounding box is then represented by 85 numbers.
• To do that, we divide the input image into a grid with dimensions equal to
those of the final feature map. For example, if the input image is 608 x 608 and
the dimensions of the feature map are 19 x 19, then the stride of the network
is 32 (608/19 = 32).
Interpreting The YOLO Output
• IMAGE (m, 608, 608, 3) -> DEEP CNN -> ENCODING (m, 19, 19, 5, 85).
• We are using 5 anchor boxes, each of the 19x19 cells thus encodes information about 5
boxes (YOLOV3).
Interpreting The YOLO Output
• Now, for each box (of each cell) we will compute the following elementwise
product and extract a probability that the box contains a certain class.
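A minimal sketch of that elementwise product for one grid cell, following the 5-anchor, 80-class encoding above (the variable names are assumptions):
import numpy as np

# One grid cell: 5 anchor boxes, each (pc, bx, by, bh, bw, 80 class probs).
cell = np.random.rand(5, 85)

pc = cell[:, 0:1]              # objectness, shape (5, 1)
class_probs = cell[:, 5:]      # shape (5, 80)
box_scores = pc * class_probs  # elementwise product, shape (5, 80)

best_class = box_scores.argmax(axis=-1)  # most likely class per box
best_score = box_scores.max(axis=-1)     # its score, used for filtering
print(best_class, best_score)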
What does YOLO Predict on an Image
Output Processing
We will carry out these steps:
• Get rid of boxes with a low score (meaning the box is not very confident
about detecting a class, e.g., pc < 0.5).
• Select only one box when several boxes overlap with each other and detect
the same object (Non-max suppression).
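A minimal sketch of the score-filtering step (toy values; per-class NMS, as sketched earlier, would then remove the overlapping survivors):
import numpy as np

num_boxes = 19 * 19 * 5                # all boxes from the grid
scores = np.random.rand(num_boxes)     # best class score per box
boxes = np.random.rand(num_boxes, 4)   # toy (x1, y1, x2, y2) values
classes = np.random.randint(0, 80, num_boxes)

keep = scores >= 0.5                   # drop low-confidence boxes
boxes, scores, classes = boxes[keep], scores[keep], classes[keep]
print(len(scores), "boxes survive the score filter; NMS comes next")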
YOLOV5
• YOLOv5 was released by Glenn Jocher on June 9, 2020. It
follows the recent release of YOLOv4 (April 23, 2020).
• YOLOv5 is smaller and generally easier to use in production.
Given it is natively implemented in PyTorch (rather than
Darknet), modifying the architecture and exporting to many
deployment environments is straightforward.
• YOLOv5s is about 88% smaller than big-YOLOv4 (27 MB vs
244 MB).
ultralytics/yolov5:
https://github.com/ultralytics/yolov5
YOLOV5
YOLOV5
Inference - Google Colab
# YOLOv5
!git clone https://github.com/ultralytics/yolov5  # clone repo
!pip install -U -r yolov5/requirements.txt  # install dependencies
%cd /content/yolov5
# ****************** or, equivalently: ******************
%cd yolov5
!pip install -r requirements.txt  # install dependencies
Inference - Google Colab
Inference - Google Colab
Inference can be run on most common media formats. Model checkpoints are
downloaded automatically if available. Results are saved
to ./inference/output.
!python detect.py --source 0 # webcam
file.jpg # image
file.mp4 # video
path/ # directory
path/*.jpg # glob
rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa # rtsp stream
http://112.50.243.8/PLTV/88888888/224/3221225900/1.m3u8 # http stream
ultralytics/yolov5:
https://github.com/ultralytics/yolov5
Inference - Example
# Inference
%cd /content/yolov5
#!python detect.py --source "/content/gdrive/My Drive/hichem data/img.jpg" --weights yolov5x.pt --conf 0.4
# --weights (yolov5s.pt, yolov5m.pt, yolov5l.pt, yolov5x.pt)
!python detect.py --source inference/images/zidane.jpg --weights yolov5l.pt --conf 0.4
import torch
from IPython.display import Image, clear_output  # to display images
Image(filename='inference/output/zidane.jpg', width=600)
# Image(filename='inference/output/bus.jpg', width=600)
Inference - Example
Custom object detection data in YOLOV5 format - image
public blood cell detection dataset
Train YOLOV5 On a Custom Dataset
Custom object detection data in YOLOv5 format - label
Label (one .txt file per image): each line is class x_center y_center width height,
with all coordinates normalized to [0, 1].
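For example, a label file for an image with two objects might contain lines
like the following (illustrative values):
1 0.408 0.302 0.104 0.087
2 0.651 0.540 0.220 0.190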
Train YOLOV5 On a Custom Dataset
Train YOLOV5 On a Custom Dataset
Download custom object detection data in YOLOV5 format from
Roboflow.
# Download Custom Dataset
%mkdir /content/my_dataset/
%cd /content/my_dataset
!curl -L "YOUR LINK HERE" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
Roboflow simplifies your computer vision workflow, from data organization,
annotation verification, preprocessing, and augmentation to exporting to your
required model format. https://app.roboflow.ai/login
Getting Started with Roboflow: https://blog.roboflow.com/getting-started-with-roboflow/
Use the "YOLOv5 PyTorch"
export format. Note that the
Ultralytics implementation
calls for a YAML file defining
where your training and test
data is. The Roboflow export
also writes this format for us.
Train YOLOV5 On a Custom Dataset
Hichem Felouat - Algeria - hichemfel@gmail.com 40
Define Model Configuration and Architecture
The data.yaml file specifies the location of the YOLOv5 images
folder and the YOLOv5 labels folder, and holds information on our custom
classes.
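A typical data.yaml might look like the following (a sketch assuming the
Roboflow export layout; the class list matches the blood cell example):
train: /content/my_dataset/train/images
val: /content/my_dataset/valid/images
nc: 3  # number of classes
names: ['Platelets', 'RBC', 'WBC']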
Define Model Configuration and Architecture
An example of how to organize a dataset:
my_dataset/ contains the train, validation, and test sets (each with its
images and its labels) plus the data.yaml file.
Define Model Configuration and Architecture
1) We will write a new YAML script file (e.g.,
my_model_l.yaml) that defines the parameters of our
model, like the number of classes, anchors, and each
layer.
2) Copy the script from one of the provided files
(yolov5s/m/l/x.yaml) and paste it into your YAML script
file (e.g., my_model_l.yaml).
3) Update the parameters.
Define Model Configuration and Architecture
Train YoloV5 on Custom Dataset
# Train YOLOv5 on a custom dataset
# weights:
# https://drive.google.com/drive/folders/1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J
%%time
%cd /content/yolov5/
!python train.py --img 416 --batch 16 --epochs 200 --data /content/my_dataset/data.yaml --cfg models/my_model_l.yaml --weights yolov5l.pt --name my_model_l_results --cache
Train YoloV5 on Custom Dataset
• img: define input image size
• batch: determine batch size
• epochs: define the number of training epochs. (Note: often,
3000+ are common here!)
• data: set the path to our yaml file
• cfg: specify our model configuration
• weights: specify a custom path to weights. (Note: you can
download weights from the Ultralytics Google Drive folder)
• name: result names
• nosave: only save the final checkpoint
• cache: cache images for faster training
Evaluate Custom YOLOV5 Detector Performance
# Evaluate custom YOLOv5 detector performance
%load_ext tensorboard
%tensorboard --logdir runs
# We can also output some older-school graphs if TensorBoard isn't working for whatever reason...
Image(filename='/content/yolov5/runs/exp0_my_model_l_results/results.png', width=1000)  # view results.png
Run Inference With Trained Weights
# Inference, 1 image:
%cd /content/yolov5/
!python detect.py --weights /content/yolov5/runs/exp0_my_model_l_results/weights/best.pt --img 416 --conf 0.5 --source /content/my_dataset/test/images/BloodImage_00038_jpg.rf.63da20f3f5538d0d2be8c4633c7034a1.jpg
# Display 1 image:
import torch
from IPython.display import Image, clear_output  # to display images
Image(filename='/content/yolov5/inference/output/BloodImage_00038_jpg.rf.63da20f3f5538d0d2be8c4633c7034a1.jpg', width=600)
Run Inference With Trained Weights
# Inference, all test images:
%cd /content/yolov5/
!python detect.py --weights /content/yolov5/runs/exp0_my_model_l_results/weights/best.pt --img 416 --conf 0.5 --source /content/my_dataset/test/images
# Display inference on ALL test images:
import glob
from IPython.display import Image, display
for imageName in glob.glob('/content/yolov5/inference/output/*.jpg'):
    display(Image(filename=imageName))
    print("\n")
Export Saved YOLOV5 Weights for Future Inference
# Export saved YOLOv5 weights for future inference
from google.colab import drive
drive.mount('/content/gdrive')
%cp /content/yolov5/runs/exp2_my_model_results/weights/best_my_model_results.pt "/content/gdrive/My Drive"
We recommend following the full code in this YOLOv5 Colab Notebook:
https://colab.research.google.com/drive/1gDZ2xcTOgR39tGGs-EZ6i3RTs16wmzZQ
How to Train YOLOv5 On a Custom Dataset:
https://blog.roboflow.com/how-to-train-yolov5-on-a-custom-dataset/
Faster R-CNN
• Faster R-CNN (Faster Region-based Convolutional
Neural Network) is now a canonical model for deep
learning-based object detection. It helped inspire many
detection and segmentation models that came after it.
• Faster R-CNN was originally published in NIPS 2015.
• We cannot understand Faster R-CNN without
understanding its predecessors, R-CNN and
Fast R-CNN.
R-CNN
R-CNN (Region-based Convolutional Neural Network) (2014):
• Scan the input image for possible objects using an algorithm called
Selective Search, generating ~2000 region proposals.
• Run a convolutional neural net (CNN) on top of each of these region
proposals.
• Take the output of each CNN and feed it into a) an SVM to classify the
region and b) a linear regressor to tighten the bounding box of the object if
such an object exists.
Selective Search for Object Recognition: https://link.springer.com/article/10.1007%252Fs11263-013-0620-5
R-CNN: https://arxiv.org/abs/1311.2524
R-CNN
Fast R-CNN
The purpose of the Fast Region-based Convolutional Network (Fast R-CNN) is
to reduce the time consumption related to the high number of models
necessary to analyze all region proposals.
• Fast R-CNN was developed by Ross Girshick in 2015.
• It performs feature extraction over the image before proposing regions, thus
only running one CNN over the entire image instead of 2000 CNNs over
2000 overlapping regions.
• It replaces the SVM with a softmax layer, thus extending the neural
network for predictions instead of creating a new model.
Fast R-CNN: https://arxiv.org/abs/1504.08083
Fast R-CNN
Faster R-CNN
• The input image is passed through a pre-trained CNN (The original Faster
R-CNN used VGG), then we use the convolutional feature map we get for
the next part.
• Next, the Region Proposal Network (RPN) uses the features that the CNN
computed to find up to a predefined number of regions (bounding boxes),
which may contain objects. It generates multiple possible regions based on
k fixed-ratio anchor boxes (default bounding boxes). RPN accelerates the
training and testing processes.
• Finally, comes the R-CNN module, which uses that information to: Classify
the content in the bounding box (or discard it, using “background” as a
label). Adjust the bounding box coordinates (so it better fits the object).
Faster R-CNN
Faster R-CNN: Down the rabbit hole of modern object detection
https://tryolabs.com/blog/2018/01/18/faster-r-cnn-down-the-rabbit-hole-of-modern-object-detection/
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
https://arxiv.org/abs/1506.01497
• EfficientDets are a family of object detector models based on
EfficientNet; they are reportedly much more efficient than other
state-of-the-art models.
• In object detection, balancing the trade-off between accuracy
and performance efficiency of the detectors is a major challenge; EfficientDet
attempts to minimize this trade-off and give the best detector both in terms
of accuracy and performance.
• EfficientDet-D7 achieves a state-of-the-art 51.0 mAP on the COCO dataset with
52M parameters and 326B FLOPs, being 4x smaller and using 9.3x
fewer FLOPs than the best previous detector.
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks: https://arxiv.org/abs/1905.11946
EfficientDet: Scalable and Efficient Object Detection : https://arxiv.org/abs/1911.09070
Github: https://github.com/google/automl/tree/master/efficientdet
EfficientDet architecture. EfficientDet uses EfficientNet as the backbone network and a newly proposed
bidirectional feature pyramid network (BiFPN) as the feature network.
Instance Segmentation
Instance segmentation aims to predict the object class label and a
pixel-level instance mask; it localizes the different classes of object
instances present in an image. Instance segmentation is broadly useful
in robotics, autonomous driving, surveillance, etc.
• Classification: there is a balloon in this image.
• Semantic Segmentation: these are all the
balloon pixels.
• Object Detection: there are 7 balloons in this
image at these locations. We're starting to
account for objects that overlap.
• Instance Segmentation: there are 7 balloons
at these locations, and these are the pixels that
belong to each one.
A Survey on Instance Segmentation: State of the art
https://arxiv.org/abs/2007.00047
Mask R-CNN
• Mask R-CNN (Mask Regional Convolutional Neural Network) is
a state-of-the-art model for instance segmentation, developed
on top of Faster R-CNN.
• For a given image, Mask R-CNN, in addition to the class label
and bounding box coordinates for each object, also returns
the object mask.
• The Mask R-CNN model was introduced in the 2017 paper titled
"Mask R-CNN".
Mask R-CNN: https://arxiv.org/abs/1703.06870
Mask R-CNN
Detectron
• Detectron2 was built by Facebook AI Research (FAIR) to support
rapid implementation and evaluation of novel computer vision
research.
• Detectron2 is now implemented in PyTorch.
• Detectron2 is flexible and extensible, and able to provide fast
training on single or multiple GPU servers.
• Detectron2 can be used as a library to support different projects
on top of it.
Detectron2: A PyTorch-based modular object detection library
https://ai.facebook.com/blog/-detectron2-a-pytorch-based-modular-object-detection-library-/
Detectron2:
https://github.com/facebookresearch/detectron2
detectron2’s documentation: https://detectron2.readthedocs.io/
Detectron Model Zoo
Detectron Model Zoo
Detectron2 Model Zoo : https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
Detectron - Google Colab
# install dependencies:
!pip install pyyaml==5.1 'pycocotools>=2.0.1'
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())
!gcc --version
# install detectron2: (Colab has CUDA 10.1 + torch 1.6)
# See https://detectron2.readthedocs.io/tutorials/install.html for instructions
assert torch.__version__.startswith("1.6")
!pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html
Detectron - Inference
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
# import some common libraries
import numpy as np
import os, json, cv2, random
from google.colab.patches import cv2_imshow
# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
# Load an image :
img = cv2.imread("/content/img.png")
cv2_imshow(img)
Detectron2 Beginner's Tutorial (Colab):
https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5
Detectron - Inference
# MODEL_ZOO : Choose the link of the algorithm https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
# faster_rcnn
model_link1 = "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"
# mask_rcnn
model_link2 = "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"
# keypoint_rcnn
model_link3 = "COCO-Keypoints/keypoint_rcnn_X_101_32x8d_FPN_3x.yaml"
# Run a pre-trained detectron2 model :
# we create a detectron2 config and a detectron2 DefaultPredictor to run inference on this image.
cfg = get_cfg()
# add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library
cfg.merge_from_file(model_zoo.get_config_file(model_link2))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5 # set threshold for this model
# Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... url as well
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(model_link2)
predictor = DefaultPredictor(cfg)
outputs = predictor(img)
Detectron - Inference
# Show results
print("-----------------------")
print("outputs : n",outputs)
print("-----------------------")
print("pred_classes : n",outputs["instances"].pred_classes)
print("pred_boxes : n",outputs["instances"].pred_boxes)
# pred_masks for mask_rcnn : outputs["instances"].pred_masks
# We can use `Visualizer` to draw the predictions on the image.
vis = Visualizer(img[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = vis.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(out.get_image()[:, :, ::-1])
"""
# PanopticSegmentation
model_link = "COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml"
.
.
predictor = DefaultPredictor(cfg)
panoptic_seg, segments_info = predictor(img)["panoptic_seg"]
v = Visualizer(img[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_panoptic_seg_predictions(panoptic_seg.to("cpu"), segments_info)
cv2_imshow(out.get_image()[:, :, ::-1])
"""
Detectron - Inference
Train Detectron on a custom dataset
Convert your data-set to COCO-format :
The COCO dataset is formatted in JSON and has five annotation
types: object detection, keypoint detection, stuff segmentation,
panoptic segmentation, and image captioning. It is a collection of
“info”, “licenses”, “images”, “annotations”, “categories” (in most
cases), and “segment info” (in one case).
• If your dataset is not in COCO format, you have to convert it.
Train Detectron on a custom dataset
{
"info": {
"year": "2020",
"version": "1",
"description": "Exported from roboflow.ai",
"contributor": "Roboflow",
"url": "https://public.roboflow.ai/object-detection/bccd",
"date_created": "2020-03-16T01:31:26+00:00"
},
"licenses": [
{
"id": 1,
"url": "https://creativecommons.org/publicdomain/zero/1.0/",
"name": "Public Domain"
}
],
"categories": [
{
"id": 0,
"name": "cells",
"supercategory": "none"
},
{
"id": 1,
"name": "Platelets",
"supercategory": "cells"
},
{
"id": 2,
"name": "RBC",
"supercategory": "cells"
},
{
"id": 3,
"name": "WBC",
"supercategory": "cells"
}
],
"images": [
{
"id": 0,
"license": 1,
"file_name": "BloodImage_00240_jpg.rf.02e03d6cdaa6a0c4446fbede5b024dd2.jpg",
"height": 416,
"width": 416,
"date_captured": "2020-03-16T01:31:26+00:00"
}
],
"annotations": [
{
"id": 0,
"image_id": 0,
"category_id": 2,
"bbox": [
242,
240,
67.60000000000001,
93.60000000000001
],
"area": 6327.3600000000015,
"segmentation": [],
"iscrowd": 0
}
]
}
COCO-format
Train Detectron on a custom dataset
Import and Register Custom Detectron2 Data in COCO JSON format
From Roboflow :
# Download Custom Dataset
# COCO JSON
%mkdir /content/my_dataset/
%cd /content/my_dataset
!curl -L "{YOUR LINK HERE}" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
Full code : https://colab.research.google.com/drive/1-TNOcPm3Jr3fOJG8rnGT9gh60mHUsvaW#scrollTo=zVBjf0DE7HEW
Train Detectron on a custom dataset
Public blood cell dataset
Train Detectron on a custom dataset
• Detectron2 keeps track of a list of available datasets in a registry,
so we must register our custom data with Detectron2 so it can be
invoked for training.
# Register Custom Detectron2 Data
from detectron2.data.datasets import register_coco_instances
register_coco_instances("my_dataset_train", {}, "/content/my_dataset/train/_annotations.coco.json",
"/content/my_dataset/train")
register_coco_instances("my_dataset_val", {}, "/content/my_dataset/valid/_annotations.coco.json",
"/content/my_dataset/valid")
register_coco_instances("my_dataset_test", {}, "/content/my_dataset/test/_annotations.coco.json",
"/content/my_dataset/test")
Train Detectron on a custom dataset
• If you want to use a custom dataset while also reusing detectron2’s data
loaders, you will need to: Register your dataset (i.e., tell detectron2 how to
obtain your dataset). Optionally, register metadata for your dataset.
• To let detectron2 know how to obtain a dataset named “my dataset”, users
need to implement a function that returns the items in your dataset and
then tell detectron2 about this function. Or you can convert your annotation
to COCO format.
Detectron2 custom dataset tutorial: https://detectron2.readthedocs.io/tutorials/datasets.html
Train Detectron on a custom dataset
# visualize training data
my_dataset_train_metadata = MetadataCatalog.get("my_dataset_train")
dataset_dicts = DatasetCatalog.get("my_dataset_train")
import random
from detectron2.utils.visualizer import Visualizer
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=my_dataset_train_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    cv2_imshow(out.get_image()[:, :, ::-1])
    print("\n")
Train Detectron on a custom dataset
# We are importing our own Trainer module here to use the COCO validation
# evaluation during training. Otherwise, no validation evaluation occurs.
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator

class CocoTrainer(DefaultTrainer):
    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
        if output_folder is None:
            os.makedirs("coco_eval", exist_ok=True)
            output_folder = "coco_eval"
        return COCOEvaluator(dataset_name, cfg, False, output_folder)
Train Detectron on a custom dataset
model_link = "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(model_link))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)
cfg.DATALOADER.NUM_WORKERS = 4 # Number of data loading threads.
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(model_link) # Let training initialize from model zoo.
cfg.SOLVER.IMS_PER_BATCH = 4 # Number of images per batch across all machines.
cfg.SOLVER.BASE_LR = 0.001
cfg.SOLVER.WARMUP_ITERS = 1000
cfg.SOLVER.MAX_ITER = 1500 # Adjust up if val mAP is still rising, adjust down if overfit.
cfg.SOLVER.STEPS = (1000, 1500) # The iteration number to decrease learning rate by GAMMA.
cfg.SOLVER.GAMMA = 0.05
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4 # Your number of classes
cfg.TEST.EVAL_PERIOD = 100 # The period (in terms of steps) to evaluate the model during training.
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = CocoTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
Detectron2.config package
https://detectron2.readthedocs.io/modules/config.html
Train Detectron on a custom dataset
# Test evaluation
from detectron2.data import DatasetCatalog, MetadataCatalog, build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.85
predictor = DefaultPredictor(cfg)
evaluator = COCOEvaluator("my_dataset_test", cfg, False, output_dir="./output/")
val_loader = build_detection_test_loader(cfg, "my_dataset_test")
inference_on_dataset(trainer.model, val_loader, evaluator)
# Look at training curves in tensorboard
%load_ext tensorboard
%tensorboard --logdir output
Train Detectron on a custom dataset
# Inference & evaluation using the trained model
# cfg already contains everything we've set previously. Now we change it a little bit for inference:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.DATASETS.TEST = ("my_dataset_test", )
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set the testing threshold for this model
predictor = DefaultPredictor(cfg)
test_metadata = MetadataCatalog.get("my_dataset_test")
from detectron2.utils.visualizer import ColorMode
import glob
for imageName in glob.glob("/content/my_dataset/test/*jpg"):
    img = cv2.imread(imageName)
    outputs = predictor(img)
    v = Visualizer(img[:, :, ::-1],
                   metadata=test_metadata,
                   scale=0.8
                   )
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
    print("\n")
Train Detectron - Save/Load Trained Model
# Save model_final.pth
%cp /content/my_dataset/output/model_final.pth "/content/gdrive/My Drive"
my_cfg = get_cfg()
my_cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"))
my_cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
my_cfg.MODEL.WEIGHTS = "/content/gdrive/My Drive/model_final.pth"
my_cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4
my_cfg.DATASETS.TEST = ("my_dataset_test", )
my_model_detectron2 = DefaultPredictor(my_cfg)
test_metadata = MetadataCatalog.get("my_dataset_test")
from detectron2.utils.visualizer import ColorMode
import glob
for imageName in glob.glob("/content/my_dataset/test/*jpg"):
    im = cv2.imread(imageName)
    outputs = my_model_detectron2(im)
    print("Classes : \n", outputs["instances"].pred_classes)
    print("Scores : \n", outputs["instances"].scores)
    v = Visualizer(im[:, :, ::-1],
                   metadata=test_metadata,
                   scale=0.8
                   )
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
    print("\n")
Instance Segmentation on a custom dataset
# Download and decompress the data
%mkdir /content/my_dataset/
%cd /content/my_dataset
!wget https://github.com/TannerGilbert/Detectron2-Train-a-Instance-Segmentation-Model/raw/master/microcontroller_segmentation_data.zip
!unzip microcontroller_segmentation_data.zip
Full code : https://github.com/TannerGilbert/Detectron2-Train-a-Instance-Segmentation-Model
Instance Segmentation on a custom dataset
Here, the dataset is in its own custom format, so we write a function to parse
it and prepare it into detectron2's standard format. Users should write such a
function when using a dataset in a custom format (get_my_dataset_dicts). See
the tutorial for more details: https://detectron2.readthedocs.io/tutorials/datasets.html
Instance Segmentation on a custom dataset
from detectron2.structures import BoxMode

def get_my_dataset_dicts(directory):
    classes = ['Raspberry_Pi_3', 'Arduino_Nano', 'ESP8266', 'Heltec_ESP32_Lora']
    dataset_dicts = []
    for filename in [file for file in os.listdir(directory) if file.endswith('.json')]:
        json_file = os.path.join(directory, filename)
        with open(json_file) as f:
            img_anns = json.load(f)
        record = {}
        filename = os.path.join(directory, img_anns["imagePath"])
        record["file_name"] = filename
        record["height"] = 600
        record["width"] = 800
        annos = img_anns["shapes"]
        objs = []
        for anno in annos:
            px = [a[0] for a in anno['points']]
            py = [a[1] for a in anno['points']]
            poly = [(x, y) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]
            obj = {
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": classes.index(anno['label']),
                "iscrowd": 0
            }
            objs.append(obj)
        record["annotations"] = objs
        dataset_dicts.append(record)
    return dataset_dicts
Instance Segmentation on a custom dataset
from detectron2.data import DatasetCatalog, MetadataCatalog
for d in ["train", "test"]:
    DatasetCatalog.register("my_dataset_" + d, lambda d=d:
        get_my_dataset_dicts("/content/my_dataset/Microcontroller Segmentation/" + d))
    MetadataCatalog.get("my_dataset_" + d).set(
        thing_classes=["Raspberry_Pi_3", "Arduino_Nano", "ESP8266", "Heltec_ESP32_Lora"])
my_dataset_metadata = MetadataCatalog.get("my_dataset_train")
# visualize training data
import random
dataset_dicts = get_my_dataset_dicts("/content/my_dataset/Microcontroller Segmentation/train")
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=my_dataset_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    cv2_imshow(out.get_image()[:, :, ::-1])
Instance Segmentation on a custom dataset
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
import os
# mask_rcnn
model_link = "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(model_link))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(model_link)
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 1500
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
Instance Segmentation on a custom dataset
# Inference & evaluation using the trained model
# cfg already contains everything we've set previously. Now we change it a little bit for inference:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set a custom testing threshold
cfg.DATASETS.TEST = ("my_dataset_test", )
predictor = DefaultPredictor(cfg)
from detectron2.utils.visualizer import ColorMode
dataset_dicts = get_my_dataset_dicts("/content/my_dataset/Microcontroller Segmentation/test")
for d in random.sample(dataset_dicts, 5):
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)
    v = Visualizer(img[:, :, ::-1],
                   metadata=my_dataset_metadata,
                   scale=0.5,
                   instance_mode=ColorMode.IMAGE_BW
                   )
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
    print("\n")
TensorFlow Object Detection API
• The TensorFlow Object Detection API is an open-source
framework built on top of TensorFlow that
makes it easy to construct, train, and deploy object
detection models.
• The TensorFlow Object Detection API allows you to
train a collection of state-of-the-art object detection
models under a unified framework.
TensorFlow Object Detection API : https://github.com/tensorflow/models/tree/master/research/object_detection
TensorFlow Object Detection API - Model Zoo
Model Zoo : https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md
TensorFlow Object Detection API - Inference
import os
import pathlib
# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
while "models" in pathlib.Path.cwd().parts:
os.chdir("..")
elif not pathlib.Path("models").exists():
!git clone --depth 1 https://github.com/tensorflow/models
# Install the Object Detection API
%%bash
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .
# Run model builder test
!python /content/models/research/object_detection/builders/model_builder_tf2_test.py
TensorFlow Object Detection API - Inference
Download the model :
More models can be found in the TensorFlow 2 Detection Model Zoo. To use a
different model you will need the URL name of the specific model. This can be
done as follows:
TensorFlow Object Detection API - Inference
# Download and extract model
def download_model(model_name, model_date):
    base_url = "http://download.tensorflow.org/models/object_detection/tf2/"
    model_file = model_name + ".tar.gz"
    model_dir = tf.keras.utils.get_file(fname=model_name,
                                        origin=base_url + model_date + "/" + model_file,
                                        untar=True)
    return str(model_dir)

# Faster R-CNN Inception ResNet V2 1024x1024:
# http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_1024x1024_coco17_tpu-8.tar.gz
# EfficientDet D7 1536x1536:
# http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d7_coco17_tpu-32.tar.gz
# Mask R-CNN Inception ResNet V2 1024x1024:
# http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz
MODEL_DATE = "20200711"
MODEL_NAME = "mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8"
PATH_TO_MODEL_DIR = download_model(MODEL_NAME, MODEL_DATE)
TensorFlow Object Detection API - Inference
# Download labels file
"""
Since the pre-trained model we will use has been trained on the COCO dataset,
we will need to download the labels file corresponding to this dataset, named
mscoco_label_map.pbtxt.
"""
def download_labels(filename):
    base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data/'
    label_dir = tf.keras.utils.get_file(fname=filename,
                                        origin=base_url + filename,
                                        untar=False)
    label_dir = pathlib.Path(label_dir)
    return str(label_dir)

LABEL_FILENAME = "mscoco_label_map.pbtxt"
PATH_TO_LABELS = download_labels(LABEL_FILENAME)
TensorFlow Object Detection API - Inference
# Load The Downloaded Model
import time
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils
PATH_TO_SAVED_MODEL = PATH_TO_MODEL_DIR + "/saved_model"
print("Loading model...", end="")
start_time = time.time()
# Load saved model and build the detection model
my_detection_model = tf.saved_model.load(PATH_TO_SAVED_MODEL)
end_time = time.time()
elapsed_time = end_time - start_time
print("Done! Took {} seconds".format(elapsed_time))
# ******************************************************************************
# Load label map data (for plotting)
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS,
use_display_name=True)
TensorFlow Object Detection API - Inference
import numpy as np
from PIL import Image  # needed for Image.open below

def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts the image into a numpy array to feed into the TensorFlow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
        path: the file path to the image
    Returns:
        uint8 numpy array with shape (img_height, img_width, 3)
    """
    return np.array(Image.open(path))
TensorFlow Object Detection API - Inference
def run_inference_for_single_image(image_path):
    print("Running inference for : ", image_path)
    image_np = load_image_into_numpy_array(image_path)
    input_tensor = tf.convert_to_tensor(image_np)
    input_tensor = input_tensor[tf.newaxis, ...]
    detections = my_detection_model(input_tensor)
    num_detections = int(detections.pop("num_detections"))
    detections = {key: value[0, :num_detections].numpy()
                  for key, value in detections.items()}
    detections["num_detections"] = num_detections
    detections["detection_classes"] = detections["detection_classes"].astype(np.int64)
    image_np_with_detections = image_np.copy()
    # Visualizing the results
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_detections,
        detections["detection_boxes"],
        detections["detection_classes"],
        detections["detection_scores"],
        category_index,
        use_normalized_coordinates=True,
        max_boxes_to_draw=200,
        min_score_thresh=.30,
        agnostic_mode=False)
    cv2_imshow(image_np_with_detections)

TensorFlow 2 Object Detection API tutorial:
https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/
TF2 ObjectDetectionAPI Inference.ipynb:
https://colab.research.google.com/drive/1ossp9IRFgjIQ1-PbWqaChcgMfjDPGDyL?usp=sharing
TensorFlow Object Detection API - Inference
# Mask R-CNN
from object_detection.utils import ops as utils_ops

def run_inference_for_single_image(image_path):
    print("Running inference for : ", image_path)
    image_np = load_image_into_numpy_array(image_path)
    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image_np)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis, ...]
    # input_tensor = np.expand_dims(image_np, 0)
    detections = my_detection_model(input_tensor)
    # All outputs are batch tensors; we are only interested in the
    # first num_detections.
    num_detections = int(detections.pop("num_detections"))
    detections["num_detections"] = num_detections
    image_np_with_detections = image_np.copy()
    # Handle models with masks:
    if "detection_masks" in detections:
        # Reframe the bbox masks to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detections["detection_masks"][0], detections["detection_boxes"][0],
            image_np.shape[0], image_np.shape[1])
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5, tf.uint8)
        detections["detection_masks_reframed"] = detection_masks_reframed.numpy()
    boxes = np.asarray(detections["detection_boxes"][0])
    classes = np.asarray(detections["detection_classes"][0]).astype(np.int64)
    scores = np.asarray(detections["detection_scores"][0])
    mask = np.asarray(detections["detection_masks_reframed"])
    # Visualizing the results
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_detections,
        boxes, classes, scores, category_index, instance_masks=mask,
        use_normalized_coordinates=True,
        line_thickness=3)
    cv2_imshow(image_np_with_detections)
    display(Image.fromarray(image_np_with_detections))
    print("Done")

TF2 ObjectDetectionAPI Inference.ipynb:
https://colab.research.google.com/drive/1ossp9IRFgjIQ1-PbWqaChcgMfjDPGDyL?usp=sharing
Train TF OD API on a custom dataset
• The TF2 object detection API models expect the input
dataset in TFRecord format.
• The TFRecord format is a simple format for storing a
sequence of binary records.
• How to Create a TFRecord File for Computer Vision and
Object Detection? follow:
https://blog.roboflow.com/how-to-create-to-a-tfrecord-file-for-computer-vision/
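A TFRecord file stores serialized tf.train.Example protos. A minimal sketch of writing one record (a simplified illustration: "img.jpg" is a placeholder, and the OD API expects more feature keys, such as class labels and image size):
import tensorflow as tf

def _bytes(v): return tf.train.Feature(bytes_list=tf.train.BytesList(value=[v]))
def _floats(v): return tf.train.Feature(float_list=tf.train.FloatList(value=v))

example = tf.train.Example(features=tf.train.Features(feature={
    "image/encoded": _bytes(open("img.jpg", "rb").read()),  # raw JPEG bytes
    "image/object/bbox/xmin": _floats([0.1]),  # normalized box coordinates
    "image/object/bbox/ymin": _floats([0.2]),
    "image/object/bbox/xmax": _floats([0.4]),
    "image/object/bbox/ymax": _floats([0.5]),
}))

with tf.io.TFRecordWriter("train.tfrecord") as writer:
    writer.write(example.SerializeToString())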
Train TF OD API on a custom dataset
# Downloading data from Roboflow
# UPDATE THIS LINK - get our data from Roboflow
%mkdir /content/my_dataset/
%cd /content/my_dataset/
!curl -L "[YOUR LINK HERE]" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
• After annotating the dataset, we can convert it from its format (XML, CSV, or
any other format) to TFRecord format.
Train TF OD API on a custom dataset
# change chosen model to deploy different models available in the TF2 object detection zoo
# pretrained checkpoint:
# https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md
# base pipeline file:
# https://github.com/tensorflow/models/tree/master/research/object_detection/configs/tf2
MODELS_CONFIG = {
'efficientdet_d0': {
'model_name': 'efficientdet_d0_coco17_tpu-32',
'base_pipeline_file': 'ssd_efficientdet_d0_512x512_coco17_tpu-8.config',
'pretrained_checkpoint': 'efficientdet_d0_coco17_tpu-32.tar.gz',
'batch_size': 8
},
'efficientdet_d7': {
'model_name': 'efficientdet_d7_coco17_tpu-32',
'base_pipeline_file': 'ssd_efficientdet_d7_1536x1536_coco17_tpu-32.config',
'pretrained_checkpoint': "efficientdet_d7_coco17_tpu-32.tar.gz",
'batch_size': 4
},
'Faster_RCNN': {
'model_name': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8',
'base_pipeline_file': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.config',
'pretrained_checkpoint': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.tar.gz',
'batch_size': 4
}
}
Train TF OD API on a custom dataset
chosen_model = "Faster_RCNN"
num_steps = 2000  # The more steps, the longer the training. Increase if your loss function is still decreasing and validation metrics are increasing.
num_eval_steps = 100  # Perform evaluation after so many steps
model_name = MODELS_CONFIG[chosen_model]['model_name']
pretrained_checkpoint = MODELS_CONFIG[chosen_model]['pretrained_checkpoint']
base_pipeline_file = MODELS_CONFIG[chosen_model]['base_pipeline_file']
batch_size = MODELS_CONFIG[chosen_model]['batch_size']
Train TF OD API on a custom dataset
# Download pretrained weights
%mkdir /content/models/research/deploy/
%cd /content/models/research/deploy/
import tarfile
download_tar = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/' + pretrained_checkpoint
!wget {download_tar}
tar = tarfile.open(pretrained_checkpoint)
tar.extractall()
tar.close()
# Download base training configuration file
%cd /content/models/research/deploy
download_config = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/configs/tf2/' + base_pipeline_file
!wget {download_config}
Train TF OD API on a custom dataset
!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_file} \
    --model_dir={model_dir} \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --sample_1_of_n_eval_examples=1 \
    --num_eval_steps={num_eval_steps}
Training Custom Object Detector:
https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html
Train Custom TensorFlow2 Object Detection Model:
https://colab.research.google.com/drive/1sLqFKVV94wm-lglFq_0kGo2ciM0kecWD#scrollTo=fF8ysCfYKgTP&uniqifier=1
TF OD API - Mask-RCNN on a custom dataset
Dataset Gathering:
As always, at the beginning of an AI project, once the problem statement has
been identified, we move on to gathering the data, in this case images for
training.
Data Labeling:
Now that we have gathered the dataset, we need to label the images so that the
model understands what the object of interest in each picture is. To label the
images we need labeling software, for example labelme (instance
segmentation).
TF OD API - Mask-RCNN on a custom dataset
labelme
TF OD API - Mask-RCNN on a custom dataset
Create TFRecords:
• After labeling our entire dataset, we now have to generate the TFRecords that
serve as input for model training. labelme gives us JSON files, so we
need to convert the labelme JSON labels into COCO format.
• You can use the labelme2coco package to convert labelme annotations to
COCO format. https://github.com/fcakyon/labelme2coco
• Now that the data is in COCO format, we can create the TFRecord files. For
this, we will make use of the create_coco_tf_record.py file.
TF OD API - Mask-RCNN on a custom dataset
Follow the full code of Mask R-CNN in my Colab Notebook:
https://colab.research.google.com/drive/1mBw-dLLyM96N2KePnh9fE6Wwm4RWw9H9?usp=sharing
Thanks For
Your Attention
Hichem Felouat ...

Tackling Open Images Challenge (2019)Tackling Open Images Challenge (2019)
Tackling Open Images Challenge (2019)Hiroto Honda
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonAditya Bhattacharya
 
People counting in low density video sequences2
People counting in low density video sequences2People counting in low density video sequences2
People counting in low density video sequences2Ahmed Tememe
 
BMVA summer school MATLAB programming tutorial
BMVA summer school MATLAB programming tutorialBMVA summer school MATLAB programming tutorial
BMVA summer school MATLAB programming tutorialpotaters
 
AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC...
AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC...AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC...
AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC...Amazon Web Services
 
Introduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANsIntroduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANsHichem Felouat
 

Similar to Object detection and Instance Segmentation (20)

20220811 - computer vision
20220811 - computer vision20220811 - computer vision
20220811 - computer vision
 
4_image_detection.pdf
4_image_detection.pdf4_image_detection.pdf
4_image_detection.pdf
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
 
auto-assistance system for visually impaired person
auto-assistance system for visually impaired personauto-assistance system for visually impaired person
auto-assistance system for visually impaired person
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
 
Real-time Computer Vision With Ruby - OSCON 2008
Real-time Computer Vision With Ruby - OSCON 2008Real-time Computer Vision With Ruby - OSCON 2008
Real-time Computer Vision With Ruby - OSCON 2008
 
IRJET - Real Time Object Detection using YOLOv3
IRJET - Real Time Object Detection using YOLOv3IRJET - Real Time Object Detection using YOLOv3
IRJET - Real Time Object Detection using YOLOv3
 
Human age and gender Detection
Human age and gender DetectionHuman age and gender Detection
Human age and gender Detection
 
IRJET- Object Detection using Machine Learning Technique
IRJET- Object Detection using Machine Learning TechniqueIRJET- Object Detection using Machine Learning Technique
IRJET- Object Detection using Machine Learning Technique
 
Software Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniquesSoftware Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniques
 
00463517b1e90c1e63000000
00463517b1e90c1e6300000000463517b1e90c1e63000000
00463517b1e90c1e63000000
 
Real Time Sign Language Recognition Using Deep Learning
Real Time Sign Language Recognition Using Deep LearningReal Time Sign Language Recognition Using Deep Learning
Real Time Sign Language Recognition Using Deep Learning
 
Andrii Belas "Overview of object detection approaches: cases, algorithms and...
Andrii Belas  "Overview of object detection approaches: cases, algorithms and...Andrii Belas  "Overview of object detection approaches: cases, algorithms and...
Andrii Belas "Overview of object detection approaches: cases, algorithms and...
 
Tackling Open Images Challenge (2019)
Tackling Open Images Challenge (2019)Tackling Open Images Challenge (2019)
Tackling Open Images Challenge (2019)
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathon
 
People counting in low density video sequences2
People counting in low density video sequences2People counting in low density video sequences2
People counting in low density video sequences2
 
BMVA summer school MATLAB programming tutorial
BMVA summer school MATLAB programming tutorialBMVA summer school MATLAB programming tutorial
BMVA summer school MATLAB programming tutorial
 
AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC...
AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC...AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC...
AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC...
 
Log polar coordinates
Log polar coordinatesLog polar coordinates
Log polar coordinates
 
Introduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANsIntroduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANs
 

More from Hichem Felouat

Natural Language Processing NLP (Transformers)
Natural Language Processing NLP (Transformers)Natural Language Processing NLP (Transformers)
Natural Language Processing NLP (Transformers)Hichem Felouat
 
The fundamentals of Machine Learning
The fundamentals of Machine LearningThe fundamentals of Machine Learning
The fundamentals of Machine LearningHichem Felouat
 
Artificial Intelligence and its Applications
Artificial Intelligence and its ApplicationsArtificial Intelligence and its Applications
Artificial Intelligence and its ApplicationsHichem Felouat
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)Hichem Felouat
 
Predict future time series forecasting
Predict future time series forecastingPredict future time series forecasting
Predict future time series forecastingHichem Felouat
 
How to Build your First Neural Network
How to Build your First Neural NetworkHow to Build your First Neural Network
How to Build your First Neural NetworkHichem Felouat
 
Machine Learning Algorithms
Machine Learning AlgorithmsMachine Learning Algorithms
Machine Learning AlgorithmsHichem Felouat
 
Build your own Convolutional Neural Network CNN
Build your own Convolutional Neural Network CNNBuild your own Convolutional Neural Network CNN
Build your own Convolutional Neural Network CNNHichem Felouat
 

More from Hichem Felouat (9)

Natural Language Processing NLP (Transformers)
Natural Language Processing NLP (Transformers)Natural Language Processing NLP (Transformers)
Natural Language Processing NLP (Transformers)
 
The fundamentals of Machine Learning
The fundamentals of Machine LearningThe fundamentals of Machine Learning
The fundamentals of Machine Learning
 
Artificial Intelligence and its Applications
Artificial Intelligence and its ApplicationsArtificial Intelligence and its Applications
Artificial Intelligence and its Applications
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Predict future time series forecasting
Predict future time series forecastingPredict future time series forecasting
Predict future time series forecasting
 
Transfer Learning
Transfer LearningTransfer Learning
Transfer Learning
 
How to Build your First Neural Network
How to Build your First Neural NetworkHow to Build your First Neural Network
How to Build your First Neural Network
 
Machine Learning Algorithms
Machine Learning AlgorithmsMachine Learning Algorithms
Machine Learning Algorithms
 
Build your own Convolutional Neural Network CNN
Build your own Convolutional Neural Network CNNBuild your own Convolutional Neural Network CNN
Build your own Convolutional Neural Network CNN
 

Recently uploaded

Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGSIVASHANKAR N
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesPrabhanshu Chaturvedi
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 

Recently uploaded (20)

Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 

Object detection and Instance Segmentation

• 12. Evaluate Object Detection Model - mAP: Calculate the final mAP by averaging the AP over the different classes. (Figures: mAP at different IoU thresholds; mAP example on the COCO dataset.)
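As a rough illustration, here is a minimal NumPy sketch of the PASCAL-VOC-2007-style 11-point AP for one class at one IoU threshold; the precision/recall arrays are made-up values, not from a real detector:

    import numpy as np

    # Hypothetical precision/recall pairs for one class at one IoU threshold,
    # ordered by decreasing detection confidence.
    recall    = np.array([0.0, 0.2, 0.4, 0.4, 0.6, 0.8, 1.0])
    precision = np.array([1.0, 1.0, 0.8, 0.7, 0.7, 0.6, 0.5])

    def ap_11_point(recall, precision):
        # Average the maximum precision found at or to the right of each of the
        # 11 equally spaced recall levels 0.0, 0.1, ..., 1.0.
        ap = 0.0
        for r in np.linspace(0.0, 1.0, 11):
            mask = recall >= r
            p = precision[mask].max() if mask.any() else 0.0
            ap += p / 11.0
        return ap

    print(ap_11_point(recall, precision))
    # mAP = mean of this AP over classes (and over IoU thresholds, COCO-style).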
• 13. Object Detection - Example: raw image → 416×416 labeled image with (C, X, Y, W, H) → model → predicted (C, X, Y, W, H) drawn on the result image. Each dataset item should be a tuple of the form (images, (class_labels, bounding_boxes)).
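A minimal Keras sketch of this setup, with one classification head and one box-regression head; the backbone, the 10-class output size, and the loss weights are placeholders, not the notebook's exact architecture:

    import tensorflow as tf

    base = tf.keras.applications.MobileNetV2(input_shape=(416, 416, 3),
                                             include_top=False, weights="imagenet")
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    class_out = tf.keras.layers.Dense(10, activation="softmax", name="cls")(x)  # C (assumed 10 classes)
    box_out   = tf.keras.layers.Dense(4, activation="sigmoid", name="box")(x)   # X, Y, W, H in [0, 1]
    model = tf.keras.Model(inputs=base.input, outputs=[class_out, box_out])
    model.compile(loss=["sparse_categorical_crossentropy", "mse"],
                  loss_weights=[0.8, 0.2], optimizer="adam")
    # Each training item then matches the tuple above: (image, (class_label, bounding_box)).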
• 14. Object Detection - Example. Full code: https://www.kaggle.com/hichemfelouat/braintumorlocalization
• 15. Object Detection - Example with Transfer Learning. Full code: https://www.kaggle.com/hichemfelouat/braintumorlocalization
• 16. Multiple Objects Detection: The task of classifying and localizing multiple objects in an image is called object detection. A simple approach is to detect multiple objects by sliding a CNN across the image, as sketched below.
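An illustrative sliding-window loop, assuming `classifier` is some pre-trained CNN wrapper that scores one crop at a time (names and sizes are assumptions):

    import numpy as np

    def sliding_window_detect(image, classifier, win=128, stride=32, thresh=0.9):
        # Slide a fixed-size window over a NumPy image and keep confident crops.
        boxes = []
        h, w = image.shape[:2]
        for top in range(0, h - win + 1, stride):
            for left in range(0, w - win + 1, stride):
                crop = image[top:top + win, left:left + win]
                score = classifier(crop)  # assumed: returns an objectness score in [0, 1]
                if score >= thresh:
                    boxes.append((left, top, win, win, score))
        return boxes  # overlapping boxes; prune with non-max suppression afterwards

This brute-force approach is slow (one CNN pass per window), which is the motivation for single-shot detectors such as YOLO below.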
• 17. Multiple Objects Detection: 1) Add an extra objectness output to your CNN, to estimate the probability that an object is indeed present in the image. 2) Find the bounding box with the highest objectness score, and get rid of all the other bounding boxes that overlap a lot with it (high IoU). 3) Repeat step 2 until there are no more bounding boxes to get rid of.
• 18. Multiple Objects Detection — In general, object detectors have three main components: 1) the backbone, which extracts features from the given image; 2) the feature network, which takes multiple levels of features from the backbone as input and outputs a list of fused features representing salient characteristics of the image; 3) the final class/box network, which uses the fused features to predict the class and location of each object.
• 19. What is YOLO? You Only Look Once (YOLO) is an algorithm that uses convolutional neural networks for object detection. It is one of the fastest object detection algorithms available and a very good choice when real-time detection is needed without losing too much accuracy. You Only Look Once: Unified, Real-Time Object Detection: https://arxiv.org/abs/1506.02640 | YOLOv3: https://arxiv.org/abs/1804.02767 | YOLOv4: https://arxiv.org/abs/2004.10934
• 20. What is YOLO? (Figure: YOLOv4 architecture.)
• 21. What is YOLO? Backbone refers to the feature-extraction architecture. Tiny YOLO has only 9 convolutional layers, so it is less accurate but faster and better suited for mobile and embedded projects. Darknet-53 (the backbone used in YOLOv3) has 53 convolutional layers, so it is more accurate but slower. In YOLOv4, the backbone can be VGG, ResNet, SpineNet, EfficientNet, ResNeXt, or Darknet-53.
• 22. What is YOLO? Neck: additional layers between the backbone and the head (the dense prediction block); its purpose is to add extra information to the features (Feature Pyramid Networks, Path Aggregation Network, ...). Head: the part used to locate bounding boxes and classify what is inside each box. Sparse prediction: used in two-stage detection algorithms.
• 23. Interpreting the YOLO Output: The input is a batch of images of shape (m, 608, 608, 3). The output is a list of bounding boxes along with the recognized classes; each bounding box is represented by 6 numbers (pc, bx, by, bh, bw, c). If we expand c into an 80-dimensional vector (the number of classes), each bounding box is then represented by 85 numbers. To do that, we divide the input image into a grid with the dimensions of the final feature map. For example, if the input image is 608 × 608 and the feature map is 19 × 19, the stride of the network is 32 (608/19 = 32).
• 24. Interpreting the YOLO Output: IMAGE (m, 608, 608, 3) → DEEP CNN → ENCODING (m, 19, 19, 5, 85). Since we are using 5 anchor boxes, each of the 19 × 19 cells encodes information about 5 boxes (YOLOv3).
• 25. Interpreting the YOLO Output: Now, for each box (of each cell), we compute an elementwise product and extract the probability that the box contains a certain class, as in the sketch below.
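A NumPy sketch of that elementwise product on one (19, 19, 5, 85) output tensor; random values stand in for a real network output, and the channel layout follows the (pc, bx, by, bh, bw, c) ordering above:

    import numpy as np

    feats = np.random.rand(19, 19, 5, 85)   # stand-in for the network output
    box_conf    = feats[..., 0:1]           # pc: objectness, shape (19, 19, 5, 1)
    class_probs = feats[..., 5:]            # the 80-dimensional class vector c
    box_scores  = box_conf * class_probs    # elementwise product, (19, 19, 5, 80)

    box_class = box_scores.argmax(axis=-1)  # best class per box
    box_score = box_scores.max(axis=-1)     # its score
    keep = box_score >= 0.5                 # filter low-confidence boxes
    print(keep.sum(), "boxes survive the score threshold")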
• 26. What does YOLO Predict on an Image (figure).
• 27. Output Processing — We carry out these steps: 1) Get rid of boxes with a low score, meaning the model is not very confident about detecting a class (e.g., pc < 0.5). 2) When several boxes overlap with each other and detect the same object, select only one box (non-max suppression).
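A minimal sketch of both steps for axis-aligned (x1, y1, x2, y2) boxes; thresholds here are illustrative:

    import numpy as np

    def iou(a, b):
        # Intersection over Union of two (x1, y1, x2, y2) boxes.
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter + 1e-9)

    def nms(boxes, scores, score_thresh=0.5, iou_thresh=0.5):
        # Drop low-score boxes, then greedily keep the best box and discard
        # any remaining box that overlaps it too much.
        order = [i for i in np.argsort(scores)[::-1] if scores[i] >= score_thresh]
        keep = []
        while order:
            best = order.pop(0)
            keep.append(best)
            order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
        return keep  # indices of the surviving boxes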
• 28. YOLOv5 — YOLOv5 was released by Glenn Jocher on June 9, 2020, following the release of YOLOv4 (April 23, 2020). YOLOv5 is smaller and generally easier to use in production. Since it is natively implemented in PyTorch (rather than Darknet), modifying the architecture and exporting to many deployment environments is straightforward. YOLOv5s is about 88% smaller than big-YOLOv4 (27 MB vs. 244 MB). ultralytics/yolov5: https://github.com/ultralytics/yolov5
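Because it is native PyTorch, a pretrained model can also be pulled straight from PyTorch Hub; a minimal sketch, assuming a recent release of the ultralytics/yolov5 repo that ships this hub entry point:

    import torch

    # Downloads the small yolov5s checkpoint on first use.
    model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
    results = model("https://ultralytics.com/images/zidane.jpg")  # path, URL, or array
    results.print()  # classes, confidences, and box coordinates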
• 29. YOLOv5 (figure).
• 30. YOLOv5 (figure).
• 31. Inference - Google Colab:
    # YOLOv5
    !git clone https://github.com/ultralytics/yolov5  # clone repo
    !pip install -U -r yolov5/requirements.txt  # install dependencies
    %cd /content/yolov5
    # or, equivalently, from inside the repo:
    !pip install -r requirements.txt  # install dependencies
    %cd yolov5
• 32. Inference - Google Colab (figure).
• 33. Inference - Google Colab: Inference can be run on most common media formats. Model checkpoints are downloaded automatically if available. Results are saved to ./inference/output.
    !python detect.py --source 0    # webcam
                      file.jpg      # image
                      file.mp4      # video
                      path/         # directory
                      path/*.jpg    # glob
                      rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa  # rtsp stream
                      http://112.50.243.8/PLTV/88888888/224/3221225900/1.m3u8         # http stream
    ultralytics/yolov5: https://github.com/ultralytics/yolov5
• 34. Inference - Example:
    # Inference
    %cd /content/yolov5
    # !python detect.py --source "/content/gdrive/My Drive/hichem data/img.jpg" --weights yolov5x.pt --conf 0.4
    # --weights (yolov5s.pt, yolov5m.pt, yolov5l.pt, yolov5x.pt)
    !python detect.py --source inference/images/zidane.jpg --weights yolov5l.pt --conf 0.4
    import torch
    from IPython.display import Image, clear_output  # to display images
    Image(filename='inference/output/zidane.jpg', width=600)
    # Image(filename='inference/output/bus.jpg', width=600)
• 35. Inference - Example (result images).
• 36. Train YOLOv5 On a Custom Dataset — Custom object detection data in YOLOv5 format (images): public blood cell detection dataset.
• 37. Train YOLOv5 On a Custom Dataset — Custom object detection data in YOLOv5 format (labels): each image gets a .txt label file with one row per object, of the form class x_center y_center width height, with all coordinates normalized to [0, 1].
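A small sketch of converting a pixel-space box to one of these label rows; the function name and the sample values are illustrative:

    def to_yolo_row(cls_id, x_min, y_min, x_max, y_max, img_w, img_h):
        # YOLO labels store the box center and size, normalized by image size.
        x_c = (x_min + x_max) / 2 / img_w
        y_c = (y_min + y_max) / 2 / img_h
        w = (x_max - x_min) / img_w
        h = (y_max - y_min) / img_h
        return f"{cls_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

    print(to_yolo_row(2, 242, 240, 310, 334, 416, 416))
    # -> "2 0.663462 0.689904 0.163462 0.225962"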
• 38. Train YOLOv5 On a Custom Dataset — Download custom object detection data in YOLOv5 format from Roboflow:
    # Download Custom Dataset
    %mkdir /content/my_dataset/
    %cd /content/my_dataset
    !curl -L "YOUR LINK HERE" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
    Roboflow simplifies your computer vision workflow, from data organization, annotation verification, preprocessing, and augmentation to exporting in your required model format. https://app.roboflow.ai/login | Getting Started with Roboflow: https://blog.roboflow.com/getting-started-with-roboflow/
• 39. Train YOLOv5 On a Custom Dataset — Use the "YOLOv5 PyTorch" export format. Note that the Ultralytics implementation calls for a YAML file defining where your training and test data are; the Roboflow export writes this file for us.
• 40. Define Model Configuration and Architecture — The data.yaml file specifies the location of the YOLOv5 images folder, the YOLOv5 labels folder, and information on our custom classes.
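A sketch of what such a file can contain, written from Python; the paths and class names are assumptions matching the blood-cell example, not Roboflow's exact export:

    yaml_text = """
    train: /content/my_dataset/train/images
    val: /content/my_dataset/valid/images

    nc: 4
    names: ['cells', 'Platelets', 'RBC', 'WBC']
    """

    # Write the config where train.py expects it (path assumed).
    with open("/content/my_dataset/data.yaml", "w") as f:
        f.write(yaml_text.strip() + "\n")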
• 41. Define Model Configuration and Architecture — An example of how to organize a dataset: a my_dataset/ folder holding the dataset splits (e.g., a test set with its test images and test labels) plus the data.yaml file. (Folder-tree figure.)
• 42. Define Model Configuration and Architecture — 1) We will write a new yaml script file (e.g., my_model_l.yaml) that defines the parameters of our model, like the number of classes, anchors, and each layer. 2) Copy then paste the script from one of the provided files (yolov5s/m/l/x.yaml) into your yaml script file (e.g., my_model_l.yaml). 3) Update the parameters.
• 43. Define Model Configuration and Architecture (example yaml figure).
• 44. Train YOLOv5 on a Custom Dataset:
    # Train YOLOv5 on a custom dataset
    # weights: https://drive.google.com/drive/folders/1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J
    %%time
    %cd /content/yolov5/
    !python train.py --img 416 --batch 16 --epochs 200 --data /content/my_dataset/data.yaml --cfg models/my_model_l.yaml --weights yolov5l.pt --name my_model_l_results --cache
• 45. Train YOLOv5 on a Custom Dataset — train.py arguments:
    --img: input image size
    --batch: batch size
    --epochs: number of training epochs (note: 3000+ are common here!)
    --data: path to the data.yaml file
    --cfg: model configuration
    --weights: custom path to weights (note: you can download weights from the Ultralytics Google Drive folder)
    --name: result names
    --nosave: only save the final checkpoint
    --cache: cache images for faster training
• 46. Evaluate Custom YOLOv5 Detector Performance:
    %load_ext tensorboard
    %tensorboard --logdir runs
    # We can also output some older-school graphs if TensorBoard isn't working for whatever reason:
    Image(filename='/content/yolov5/runs/exp0_my_model_l_results/results.png', width=1000)  # view results.png
• 47. Run Inference With Trained Weights:
    # Inference, 1 image:
    %cd /content/yolov5/
    !python detect.py --weights /content/yolov5/runs/exp0_my_model_l_results/weights/best.pt --img 416 --conf 0.5 --source /content/my_dataset/test/images/BloodImage_00038_jpg.rf.63da20f3f5538d0d2be8c4633c7034a1.jpg
    # Display 1 image:
    import torch
    from IPython.display import Image, clear_output  # to display images
    Image(filename='/content/yolov5/inference/output/BloodImage_00038_jpg.rf.63da20f3f5538d0d2be8c4633c7034a1.jpg', width=600)
• 48. Run Inference With Trained Weights:
    # Inference, all test images:
    %cd /content/yolov5/
    !python detect.py --weights /content/yolov5/runs/exp0_my_model_l_results/weights/best.pt --img 416 --conf 0.5 --source /content/my_dataset/test/images
    # Display inference on ALL test images:
    import glob
    from IPython.display import Image, display
    for imageName in glob.glob('/content/yolov5/inference/output/*.jpg'):
        display(Image(filename=imageName))
        print("\n")
• 49. Export Saved YOLOv5 Weights for Future Inference:
    # Export saved YOLOv5 weights for future inference
    from google.colab import drive
    drive.mount('/content/gdrive')
    %cp /content/yolov5/runs/exp2_my_model_results/weights/best_my_model_results.pt "/content/gdrive/My Drive"
    We recommend following the full code in this YOLOv5 Colab Notebook: https://colab.research.google.com/drive/1gDZ2xcTOgR39tGGs-EZ6i3RTs16wmzZQ | How to Train YOLOv5 On a Custom Dataset: https://blog.roboflow.com/how-to-train-yolov5-on-a-custom-dataset/
• 50. Faster R-CNN — Faster R-CNN (Faster Region-based Convolutional Neural Network) is now a canonical model for deep-learning-based object detection, and it helped inspire many detection and segmentation models that came after it. Faster R-CNN was originally published in NIPS 2015. We cannot understand Faster R-CNN without understanding its predecessors, R-CNN and Fast R-CNN.
• 51. R-CNN — R-CNN (Region-based Convolutional Neural Network, 2014): 1) Scan the input image for possible objects using an algorithm called Selective Search, generating ~2000 region proposals. 2) Run a convolutional neural network (CNN) on top of each of these region proposals. 3) Take the output of each CNN and feed it into a) an SVM to classify the region and b) a linear regressor to tighten the bounding box of the object, if such an object exists. Selective Search for Object Recognition: https://link.springer.com/article/10.1007%252Fs11263-013-0620-5 | R-CNN: https://arxiv.org/abs/1311.2524
• 52. R-CNN (architecture figure).
• 53. Fast R-CNN — The purpose of the Fast Region-based Convolutional Network (Fast R-CNN), developed by Ross Girshick in 2015, is to cut the time spent on the large number of models needed to analyze all region proposals: 1) Feature extraction is performed over the whole image before proposing regions, so only one CNN runs over the entire image instead of 2000 CNNs over 2000 overlapping regions. 2) The SVM is replaced with a softmax layer, extending the neural network for predictions instead of creating a new model. Fast R-CNN: https://arxiv.org/abs/1504.08083
• 54. Fast R-CNN (architecture figure).
• 55. Faster R-CNN — The input image is passed through a pre-trained CNN (the original Faster R-CNN used VGG), and the resulting convolutional feature map is used for the next part. Next, the Region Proposal Network (RPN) uses the features that the CNN computed to find up to a predefined number of regions (bounding boxes) which may contain objects; it generates multiple candidate regions based on k fixed-ratio anchor boxes (default bounding boxes), and it accelerates both training and testing. Finally comes the R-CNN module, which uses that information to classify the content of each bounding box (or discard it, using "background" as a label) and to adjust the bounding box coordinates so the box better fits the object.
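To make the anchor idea concrete, here is a tiny NumPy sketch that generates k fixed-ratio anchors at every feature-map location; the stride, scales, and ratios are illustrative, not the paper's exact values:

    import numpy as np

    def make_anchors(fm_h, fm_w, stride=16, scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
        anchors = []  # (x_center, y_center, width, height) in image pixels
        for i in range(fm_h):
            for j in range(fm_w):
                cx, cy = (j + 0.5) * stride, (i + 0.5) * stride
                for s in scales:
                    for r in ratios:
                        # width/height chosen so that w*h ≈ s*s and w/h = r
                        w, h = s * np.sqrt(r), s / np.sqrt(r)
                        anchors.append((cx, cy, w, h))
        return np.array(anchors)

    print(make_anchors(38, 50).shape)  # (38*50*9, 4): k = 9 anchors per cell

The RPN then scores each anchor for objectness and regresses offsets from the anchor to the actual box.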
• 56. Faster R-CNN — Faster R-CNN: Down the rabbit hole of modern object detection: https://tryolabs.com/blog/2018/01/18/faster-r-cnn-down-the-rabbit-hole-of-modern-object-detection/ | Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks: https://arxiv.org/abs/1506.01497
• 57. EfficientDet — EfficientDets are a family of object detector models based on EfficientNet that are reportedly much more efficient than other state-of-the-art models. In object detection, balancing the trade-off between accuracy and runtime efficiency is a major challenge; EfficientDet attempts to minimize this trade-off and give the best detector in terms of both accuracy and performance. EfficientDet-D7 achieves a state-of-the-art 51.0 mAP on the COCO dataset with 52M parameters and 326B FLOPs, being 4x smaller and using 9.3x fewer FLOPs. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks: https://arxiv.org/abs/1905.11946 | EfficientDet: Scalable and Efficient Object Detection: https://arxiv.org/abs/1911.09070 | GitHub: https://github.com/google/automl/tree/master/efficientdet
• 58. EfficientDet architecture — EfficientDet uses EfficientNet as the backbone network together with a newly proposed bi-directional feature pyramid network (BiFPN) as the feature network. (Architecture figure.)
• 59. (Figure.)
• 60. Instance Segmentation — Instance segmentation aims to predict the object class label and a pixel-level instance mask; it localizes the different instances of object classes present in an image, and it largely serves robotics, autonomous driving, surveillance, etc. Classification: there is a balloon in this image. Semantic segmentation: these are all the balloon pixels. Object detection: there are 7 balloons in this image at these locations (we start to account for objects that overlap). Instance segmentation: there are 7 balloons at these locations, and these are the pixels that belong to each one. A Survey on Instance Segmentation: State of the art: https://arxiv.org/abs/2007.00047
• 61. Mask R-CNN — Mask R-CNN (Mask Regional Convolutional Neural Network) is a state-of-the-art model for instance segmentation, developed on top of Faster R-CNN. For a given image, Mask R-CNN returns, in addition to the class label and bounding box coordinates of each object, the object mask. The model was introduced in the 2017 paper titled "Mask R-CNN": https://arxiv.org/abs/1703.06870
• 62. Mask R-CNN (architecture figure).
• 63. Detectron2 — Detectron2 was built by Facebook AI Research (FAIR) to support the rapid implementation and evaluation of novel computer vision research. It is now implemented in PyTorch, is flexible and extensible, provides fast training on single or multiple GPU servers, and can be used as a library to support different projects built on top of it. Detectron2: A PyTorch-based modular object detection library: https://ai.facebook.com/blog/-detectron2-a-pytorch-based-modular-object-detection-library-/ | Detectron2: https://github.com/facebookresearch/detectron2 | detectron2's documentation: https://detectron2.readthedocs.io/
• 64. Detectron2 Model Zoo (figure).
• 65. Detectron2 Model Zoo: https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
• 66. Detectron2 - Google Colab:
    # install dependencies:
    !pip install pyyaml==5.1 "pycocotools>=2.0.1"
    import torch, torchvision
    print(torch.__version__, torch.cuda.is_available())
    !gcc --version
    # install detectron2: (Colab has CUDA 10.1 + torch 1.6)
    # See https://detectron2.readthedocs.io/tutorials/install.html for instructions
    assert torch.__version__.startswith("1.6")
    !pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html
• 67. Detectron2 - Inference:
    # Some basic setup:
    # Setup detectron2 logger
    import detectron2
    from detectron2.utils.logger import setup_logger
    setup_logger()
    # import some common libraries
    import numpy as np
    import os, json, cv2, random
    from google.colab.patches import cv2_imshow
    # import some common detectron2 utilities
    from detectron2 import model_zoo
    from detectron2.engine import DefaultPredictor
    from detectron2.config import get_cfg
    from detectron2.utils.visualizer import Visualizer
    from detectron2.data import MetadataCatalog, DatasetCatalog
    # Load an image:
    img = cv2.imread("/content/img.png")
    cv2_imshow(img)
    Detectron2 Beginner's Tutorial (Colab): https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5
• 68. Detectron2 - Inference:
    # MODEL_ZOO: choose the config of the algorithm from
    # https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
    # faster_rcnn
    model_link1 = "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"
    # mask_rcnn
    model_link2 = "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"
    # keypoint_rcnn
    model_link3 = "COCO-Keypoints/keypoint_rcnn_X_101_32x8d_FPN_3x.yaml"
    # Run a pre-trained detectron2 model:
    # we create a detectron2 config and a detectron2 DefaultPredictor to run inference on this image.
    cfg = get_cfg()
    # add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library
    cfg.merge_from_file(model_zoo.get_config_file(model_link2))
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
    # Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... url as well
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(model_link2)
    predictor = DefaultPredictor(cfg)
    outputs = predictor(img)
• 69. Detectron2 - Inference:
    # Show results
    print("-----------------------")
    print("outputs : \n", outputs)
    print("-----------------------")
    print("pred_classes : \n", outputs["instances"].pred_classes)
    print("pred_boxes : \n", outputs["instances"].pred_boxes)
    # pred_masks for mask_rcnn: outputs["instances"].pred_masks
    # We can use `Visualizer` to draw the predictions on the image.
    vis = Visualizer(img[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
    out = vis.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
    """
    # PanopticSegmentation
    model_link = "COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml"
    ...
    predictor = DefaultPredictor(cfg)
    panoptic_seg, segments_info = predictor(im)["panoptic_seg"]
    v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
    out = v.draw_panoptic_seg_predictions(panoptic_seg.to("cpu"), segments_info)
    cv2_imshow(out.get_image()[:, :, ::-1])
    """
• 70. Detectron2 - Inference (result images).
• 71. Train Detectron2 on a custom dataset — Convert your dataset to COCO format: the COCO dataset is formatted in JSON and has five annotation types: object detection, keypoint detection, stuff segmentation, panoptic segmentation, and image captioning. It is a collection of "info", "licenses", "images", "annotations", "categories" (in most cases), and "segment_info" (in one case). If your dataset is not in COCO format, you have to convert it.
• 72. Train Detectron2 on a custom dataset — COCO format example:
    {
      "info": {"year": "2020", "version": "1", "description": "Exported from roboflow.ai", "contributor": "Roboflow", "url": "https://public.roboflow.ai/object-detection/bccd", "date_created": "2020-03-16T01:31:26+00:00"},
      "licenses": [{"id": 1, "url": "https://creativecommons.org/publicdomain/zero/1.0/", "name": "Public Domain"}],
      "categories": [
        {"id": 0, "name": "cells", "supercategory": "none"},
        {"id": 1, "name": "Platelets", "supercategory": "cells"},
        {"id": 2, "name": "RBC", "supercategory": "cells"},
        {"id": 3, "name": "WBC", "supercategory": "cells"}
      ],
      "images": [
        {"id": 0, "license": 1, "file_name": "BloodImage_00240_jpg.rf.02e03d6cdaa6a0c4446fbede5b024dd2.jpg", "height": 416, "width": 416, "date_captured": "2020-03-16T01:31:26+00:00"}
      ],
      "annotations": [
        {"id": 0, "image_id": 0, "category_id": 2, "bbox": [242, 240, 67.60000000000001, 93.60000000000001], "area": 6327.3600000000015, "segmentation": [], "iscrowd": 0}
      ]
    }
• 73. Train Detectron2 on a custom dataset — Import and register custom Detectron2 data in COCO JSON format from Roboflow:
    # Download Custom Dataset (COCO JSON)
    %mkdir /content/my_dataset/
    %cd /content/my_dataset
    !curl -L "{YOUR LINK HERE}" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
    Full code: https://colab.research.google.com/drive/1-TNOcPm3Jr3fOJG8rnGT9gh60mHUsvaW#scrollTo=zVBjf0DE7HEW
• 74. Train Detectron2 on a custom dataset — public blood cell dataset (sample images).
• 75. Train Detectron2 on a custom dataset — Detectron2 keeps track of the available datasets in a registry, so we must register our custom data with Detectron2 before it can be invoked for training:
    # Register Custom Detectron2 Data
    from detectron2.data.datasets import register_coco_instances
    register_coco_instances("my_dataset_train", {}, "/content/my_dataset/train/_annotations.coco.json", "/content/my_dataset/train")
    register_coco_instances("my_dataset_val", {}, "/content/my_dataset/valid/_annotations.coco.json", "/content/my_dataset/valid")
    register_coco_instances("my_dataset_test", {}, "/content/my_dataset/test/_annotations.coco.json", "/content/my_dataset/test")
• 76. Train Detectron2 on a custom dataset — If you want to use a custom dataset while also reusing detectron2's data loaders, you need to register your dataset (i.e., tell detectron2 how to obtain it) and, optionally, register metadata for it. To let detectron2 know how to obtain a dataset named "my_dataset", implement a function that returns the items of your dataset and then tell detectron2 about this function; alternatively, convert your annotations to COCO format. Detectron2 custom dataset tutorial: https://detectron2.readthedocs.io/tutorials/datasets.html
• 77. Train Detectron2 on a custom dataset:
    # visualize training data
    my_dataset_train_metadata = MetadataCatalog.get("my_dataset_train")
    dataset_dicts = DatasetCatalog.get("my_dataset_train")
    import random
    from detectron2.utils.visualizer import Visualizer
    for d in random.sample(dataset_dicts, 3):
        img = cv2.imread(d["file_name"])
        visualizer = Visualizer(img[:, :, ::-1], metadata=my_dataset_train_metadata, scale=0.5)
        out = visualizer.draw_dataset_dict(d)
        cv2_imshow(out.get_image()[:, :, ::-1])
        print("\n")
• 78. Train Detectron2 on a custom dataset:
    # We import our own Trainer module here to use the COCO validation
    # evaluation during training; otherwise, no validation eval occurs.
    from detectron2.engine import DefaultTrainer
    from detectron2.evaluation import COCOEvaluator

    class CocoTrainer(DefaultTrainer):
        @classmethod
        def build_evaluator(cls, cfg, dataset_name, output_folder=None):
            if output_folder is None:
                os.makedirs("coco_eval", exist_ok=True)
                output_folder = "coco_eval"
            return COCOEvaluator(dataset_name, cfg, False, output_folder)
• 79. Train Detectron2 on a custom dataset:
    model_link = "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(model_link))
    cfg.DATASETS.TRAIN = ("my_dataset_train",)
    cfg.DATASETS.TEST = ("my_dataset_val",)
    cfg.DATALOADER.NUM_WORKERS = 4  # Number of data loading threads.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(model_link)  # Let training initialize from model zoo.
    cfg.SOLVER.IMS_PER_BATCH = 4  # Number of images per batch across all machines.
    cfg.SOLVER.BASE_LR = 0.001
    cfg.SOLVER.WARMUP_ITERS = 1000
    cfg.SOLVER.MAX_ITER = 1500  # Adjust up if val mAP is still rising, adjust down if overfit.
    cfg.SOLVER.STEPS = (1000, 1500)  # The iteration numbers at which to decrease the learning rate by GAMMA.
    cfg.SOLVER.GAMMA = 0.05
    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4  # Your number of classes
    cfg.TEST.EVAL_PERIOD = 100  # The period (in steps) at which to evaluate the model during training.
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
    trainer = CocoTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()
    Detectron2 config reference: https://detectron2.readthedocs.io/modules/config.html
• 80. Train Detectron2 on a custom dataset:
    # Test evaluation
    from detectron2.data import DatasetCatalog, MetadataCatalog, build_detection_test_loader
    from detectron2.evaluation import COCOEvaluator, inference_on_dataset
    cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.85
    predictor = DefaultPredictor(cfg)
    evaluator = COCOEvaluator("my_dataset_test", cfg, False, output_dir="./output/")
    val_loader = build_detection_test_loader(cfg, "my_dataset_test")
    inference_on_dataset(trainer.model, val_loader, evaluator)
    # Look at training curves in tensorboard
    %load_ext tensorboard
    %tensorboard --logdir output
• 81. Train Detectron2 on a custom dataset:
    # Inference & evaluation using the trained model
    # cfg already contains everything we set previously; we change it a little for inference:
    cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
    cfg.DATASETS.TEST = ("my_dataset_test", )
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set the testing threshold for this model
    predictor = DefaultPredictor(cfg)
    test_metadata = MetadataCatalog.get("my_dataset_test")
    from detectron2.utils.visualizer import ColorMode
    import glob
    for imageName in glob.glob("/content/my_dataset/test/*jpg"):
        img = cv2.imread(imageName)
        outputs = predictor(img)
        v = Visualizer(img[:, :, ::-1], metadata=test_metadata, scale=0.8)
        out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
        cv2_imshow(out.get_image()[:, :, ::-1])
        print("\n")
• 82. Train Detectron2 - Save/Load Trained Model:
    # Save model_final.pth
    %cp /content/my_dataset/output/model_final.pth "/content/gdrive/My Drive"
    my_cfg = get_cfg()
    my_cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"))
    my_cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
    my_cfg.MODEL.WEIGHTS = "/content/gdrive/My Drive/model_final.pth"
    my_cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4
    my_cfg.DATASETS.TEST = ("my_dataset_test", )
    my_model_detectron2 = DefaultPredictor(my_cfg)
    test_metadata = MetadataCatalog.get("my_dataset_test")
    from detectron2.utils.visualizer import ColorMode
    import glob
    for imageName in glob.glob("/content/my_dataset/test/*jpg"):
        im = cv2.imread(imageName)
        outputs = my_model_detectron2(im)
        print("Classes : \n", outputs["instances"].pred_classes)
        print("Scores : \n", outputs["instances"].scores)
        v = Visualizer(im[:, :, ::-1], metadata=test_metadata, scale=0.8)
        out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
        cv2_imshow(out.get_image()[:, :, ::-1])
        print("\n")
• 83. Instance Segmentation on a custom dataset:
    # Download, decompress the data
    %mkdir /content/my_dataset/
    %cd /content/my_dataset
    !wget https://github.com/TannerGilbert/Detectron2-Train-a-Instance-Segmentation-Model/raw/master/microcontroller_segmentation_data.zip
    !unzip microcontroller_segmentation_data.zip
    Full code: https://github.com/TannerGilbert/Detectron2-Train-a-Instance-Segmentation-Model
• 84. Hichem Felouat - Algeria - hichemfel@gmail.com 84 Instance Segmentation on a custom dataset
Here, the dataset is in its own custom format, so we write a function (get_my_dataset_dicts, shown on the next slide) to parse it and prepare it into detectron2's standard format. You should write such a function whenever you use a dataset in a custom format. See the tutorial for more details: https://detectron2.readthedocs.io/tutorials/datasets.html
• 85. Hichem Felouat - Algeria - hichemfel@gmail.com 85 Instance Segmentation on a custom dataset
from detectron2.structures import BoxMode

def get_my_dataset_dicts(directory):
    classes = ['Raspberry_Pi_3', 'Arduino_Nano', 'ESP8266', 'Heltec_ESP32_Lora']
    dataset_dicts = []
    for filename in [file for file in os.listdir(directory) if file.endswith('.json')]:
        json_file = os.path.join(directory, filename)
        with open(json_file) as f:
            img_anns = json.load(f)
        record = {}
        filename = os.path.join(directory, img_anns["imagePath"])
        record["file_name"] = filename
        record["height"] = 600
        record["width"] = 800
        annos = img_anns["shapes"]
        objs = []
        for anno in annos:
            px = [a[0] for a in anno['points']]
            py = [a[1] for a in anno['points']]
            poly = [(x, y) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]  # flatten [(x, y), ...] into [x1, y1, x2, y2, ...]
            obj = {
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": classes.index(anno['label']),
                "iscrowd": 0
            }
            objs.append(obj)
        record["annotations"] = objs
        dataset_dicts.append(record)
    return dataset_dicts
• 86. Hichem Felouat - Algeria - hichemfel@gmail.com 86 Instance Segmentation on a custom dataset
from detectron2.data import DatasetCatalog, MetadataCatalog

for d in ["train", "test"]:
    DatasetCatalog.register("my_dataset_" + d, lambda d=d: get_my_dataset_dicts("/content/my_dataset/Microcontroller Segmentation/" + d))
    MetadataCatalog.get("my_dataset_" + d).set(thing_classes=["Raspberry_Pi_3", "Arduino_Nano", "ESP8266", "Heltec_ESP32_Lora"])
my_dataset_metadata = MetadataCatalog.get("my_dataset_train")

# visualize training data
import random
dataset_dicts = get_my_dataset_dicts("/content/my_dataset/Microcontroller Segmentation/train")
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=my_dataset_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    cv2_imshow(out.get_image()[:, :, ::-1])
• 87. Hichem Felouat - Algeria - hichemfel@gmail.com 87 Instance Segmentation on a custom dataset
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
import os

# mask_rcnn
model_link = "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(model_link))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(model_link)
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 1500
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
• 88. Hichem Felouat - Algeria - hichemfel@gmail.com 88 Instance Segmentation on a custom dataset
# Inference & evaluation using the trained model
# cfg already contains everything we've set previously. Now we change it a little for inference:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set a custom testing threshold
cfg.DATASETS.TEST = ("my_dataset_test",)
predictor = DefaultPredictor(cfg)

from detectron2.utils.visualizer import ColorMode
dataset_dicts = get_my_dataset_dicts("/content/my_dataset/Microcontroller Segmentation/test")
for d in random.sample(dataset_dicts, 5):
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)
    v = Visualizer(img[:, :, ::-1],
                   metadata=my_dataset_metadata,
                   scale=0.5,
                   instance_mode=ColorMode.IMAGE_BW)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
    print("\n")
• 89. Hichem Felouat - Algeria - hichemfel@gmail.com 89 TensorFlow Object Detection API
• The TensorFlow Object Detection API is an open-source framework built on top of TensorFlow that makes it easy to construct, train, and deploy object detection models.
• The TensorFlow Object Detection API allows you to train a collection of state-of-the-art object detection models under a unified framework.

TensorFlow Object Detection API: https://github.com/tensorflow/models/tree/master/research/object_detection
  • 90. Hichem Felouat - Algeria - hichemfel@gmail.com 90 TensorFlow Object Detection API - Model Zoo Model Zoo : https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md
• 91. Hichem Felouat - Algeria - hichemfel@gmail.com 91 TensorFlow Object Detection API - Inference
import os
import pathlib

# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
    while "models" in pathlib.Path.cwd().parts:
        os.chdir("..")
elif not pathlib.Path("models").exists():
    !git clone --depth 1 https://github.com/tensorflow/models

# Install the Object Detection API
%%bash
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .

# Run the model builder test
!python /content/models/research/object_detection/builders/model_builder_tf2_test.py
• 92. Hichem Felouat - Algeria - hichemfel@gmail.com 92 TensorFlow Object Detection API - Inference
Download the model: more models can be found in the TensorFlow 2 Detection Model Zoo. To use a different model, you will need the URL of that specific model. The download can be done as follows:
• 93. Hichem Felouat - Algeria - hichemfel@gmail.com 93 TensorFlow Object Detection API - Inference
# Download and extract the model
def download_model(model_name, model_date):
    base_url = "http://download.tensorflow.org/models/object_detection/tf2/"
    model_file = model_name + ".tar.gz"
    model_dir = tf.keras.utils.get_file(fname=model_name,
                                        origin=base_url + model_date + "/" + model_file,
                                        untar=True)
    return str(model_dir)

# Faster R-CNN Inception ResNet V2 1024x1024:
# http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_1024x1024_coco17_tpu-8.tar.gz
# EfficientDet D7 1536x1536:
# http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d7_coco17_tpu-32.tar.gz
# Mask R-CNN Inception ResNet V2 1024x1024:
# http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz

MODEL_DATE = "20200711"
MODEL_NAME = "mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8"
PATH_TO_MODEL_DIR = download_model(MODEL_NAME, MODEL_DATE)
• 94. Hichem Felouat - Algeria - hichemfel@gmail.com 94
# Download the labels file
"""
Since the pre-trained model we will use has been trained on the COCO dataset,
we need to download the labels file corresponding to this dataset, named
mscoco_label_map.pbtxt.
"""
def download_labels(filename):
    base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data/'
    label_dir = tf.keras.utils.get_file(fname=filename,
                                        origin=base_url + filename,
                                        untar=False)
    label_dir = pathlib.Path(label_dir)
    return str(label_dir)

LABEL_FILENAME = "mscoco_label_map.pbtxt"
PATH_TO_LABELS = download_labels(LABEL_FILENAME)

TensorFlow Object Detection API - Inference
• 95. Hichem Felouat - Algeria - hichemfel@gmail.com 95 TensorFlow Object Detection API - Inference
# Load the downloaded model
import time
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils

PATH_TO_SAVED_MODEL = PATH_TO_MODEL_DIR + "/saved_model"
print("Loading model...", end="")
start_time = time.time()
# Load the saved model and build the detection model
my_detection_model = tf.saved_model.load(PATH_TO_SAVED_MODEL)
end_time = time.time()
elapsed_time = end_time - start_time
print("Done! Took {} seconds".format(elapsed_time))

# ******************************************************************************
# Load label map data (for plotting)
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
• 96. Hichem Felouat - Algeria - hichemfel@gmail.com 96 TensorFlow Object Detection API - Inference
def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts the image into a numpy array to feed into the tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
        path: the file path to the image

    Returns:
        uint8 numpy array with shape (img_height, img_width, 3)
    """
    return np.array(Image.open(path))
• 97. Hichem Felouat - Algeria - hichemfel@gmail.com 97 TensorFlow Object Detection API - Inference
def run_inference_for_single_image(image_path):
    print("Running inference for : ", image_path)
    image_np = load_image_into_numpy_array(image_path)
    input_tensor = tf.convert_to_tensor(image_np)
    input_tensor = input_tensor[tf.newaxis, ...]
    detections = my_detection_model(input_tensor)
    num_detections = int(detections.pop("num_detections"))
    detections = {key: value[0, :num_detections].numpy() for key, value in detections.items()}
    detections["num_detections"] = num_detections
    detections["detection_classes"] = detections["detection_classes"].astype(np.int64)
    image_np_with_detections = image_np.copy()
    # Visualizing the results
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_detections,
        detections["detection_boxes"],
        detections["detection_classes"],
        detections["detection_scores"],
        category_index,
        use_normalized_coordinates=True,
        max_boxes_to_draw=200,
        min_score_thresh=.30,
        agnostic_mode=False)
    cv2_imshow(image_np_with_detections)

TensorFlow 2 Object Detection API tutorial: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/
TF2 ObjectDetectionAPI Inference.ipynb: https://colab.research.google.com/drive/1ossp9IRFgjIQ1-PbWqaChcgMfjDPGDyL?usp=sharing
• 98. Hichem Felouat - Algeria - hichemfel@gmail.com 98 TensorFlow Object Detection API - Inference
# Mask-RCNN
from object_detection.utils import ops as utils_ops

def run_inference_for_single_image(image_path):
    print("Running inference for : ", image_path)
    image_np = load_image_into_numpy_array(image_path)
    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image_np)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis, ...]
    detections = my_detection_model(input_tensor)
    # All outputs are batched tensors; we keep them batched here and take
    # index [0] below to remove the batch dimension.
    num_detections = int(detections.pop("num_detections"))
    image_np_with_detections = image_np.copy()
    # Handle models with masks:
    if "detection_masks" in detections:
        # Reframe the bbox masks to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detections["detection_masks"][0],
            detections["detection_boxes"][0],
            image_np.shape[0], image_np.shape[1])
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5, tf.uint8)
        detections["detection_masks_reframed"] = detection_masks_reframed.numpy()
    boxes = np.asarray(detections["detection_boxes"][0])
    classes = np.asarray(detections["detection_classes"][0]).astype(np.int64)
    scores = np.asarray(detections["detection_scores"][0])
    mask = np.asarray(detections["detection_masks_reframed"])
    # Visualizing the results
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_detections, boxes, classes, scores, category_index,
        instance_masks=mask, use_normalized_coordinates=True, line_thickness=3)
    cv2_imshow(image_np_with_detections)
    display(Image.fromarray(image_np_with_detections))
    print("Done")

TF2 ObjectDetectionAPI Inference.ipynb: https://colab.research.google.com/drive/1ossp9IRFgjIQ1-PbWqaChcgMfjDPGDyL?usp=sharing
• 99. Hichem Felouat - Algeria - hichemfel@gmail.com 99 Train TF OD API on a custom dataset
• The TF2 object detection API models expect the input dataset in TFRecord format.
• The TFRecord format is a simple format for storing a sequence of binary records; a minimal writing example is sketched below.
• How to create a TFRecord file for computer vision and object detection? Follow: https://blog.roboflow.com/how-to-create-to-a-tfrecord-file-for-computer-vision/
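To make the TFRecord idea concrete, here is a minimal sketch of writing a single detection example with tf.train.Example. The feature keys follow the TF OD API convention; the image path, box coordinates, and class label are hypothetical, and a real record also carries fields such as image/height, image/width, image/filename, and image/object/class/text.

import tensorflow as tf

def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def float_list_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))

def int64_list_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

# Hypothetical image with one object of class id 1.
with open("image_0001.jpg", "rb") as f:
    encoded_jpg = f.read()

example = tf.train.Example(features=tf.train.Features(feature={
    "image/encoded": bytes_feature(encoded_jpg),
    "image/format": bytes_feature(b"jpeg"),
    "image/object/bbox/xmin": float_list_feature([0.1]),  # normalized coordinates
    "image/object/bbox/xmax": float_list_feature([0.4]),
    "image/object/bbox/ymin": float_list_feature([0.2]),
    "image/object/bbox/ymax": float_list_feature([0.6]),
    "image/object/class/label": int64_list_feature([1]),
}))

# Serialize the example and append it to the record file.
with tf.io.TFRecordWriter("train.record") as writer:
    writer.write(example.SerializeToString())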
• 100. Hichem Felouat - Algeria - hichemfel@gmail.com 100 Train TF OD API on a custom dataset
# Downloading data from Roboflow
# UPDATE THIS LINK - get our data from Roboflow
%mkdir /content/my_dataset/
%cd /content/my_dataset/
!curl -L "[YOUR LINK HERE]" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

• After annotating the dataset, we can convert it from its source format (xml, csv, or any other) to TFRecord format.
• 101. Hichem Felouat - Algeria - hichemfel@gmail.com 101 Train TF OD API on a custom dataset
# Change chosen_model to deploy different models available in the TF2 object detection zoo.
# Pretrained checkpoints:
# https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md
# Base pipeline files:
# https://github.com/tensorflow/models/tree/master/research/object_detection/configs/tf2
MODELS_CONFIG = {
    'efficientdet_d0': {
        'model_name': 'efficientdet_d0_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d0_512x512_coco17_tpu-8.config',
        'pretrained_checkpoint': 'efficientdet_d0_coco17_tpu-32.tar.gz',
        'batch_size': 8
    },
    'efficientdet_d7': {
        'model_name': 'efficientdet_d7_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d7_1536x1536_coco17_tpu-32.config',
        'pretrained_checkpoint': 'efficientdet_d7_coco17_tpu-32.tar.gz',
        'batch_size': 4
    },
    'Faster_RCNN': {
        'model_name': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8',
        'base_pipeline_file': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.config',
        'pretrained_checkpoint': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.tar.gz',
        'batch_size': 4
    }
}
• 102. Hichem Felouat - Algeria - hichemfel@gmail.com 102 Train TF OD API on a custom dataset
chosen_model = "Faster_RCNN"
num_steps = 2000  # The more steps, the longer the training. Increase if your loss is still decreasing and validation metrics are still improving.
num_eval_steps = 100  # Perform evaluation after this many steps.

model_name = MODELS_CONFIG[chosen_model]['model_name']
pretrained_checkpoint = MODELS_CONFIG[chosen_model]['pretrained_checkpoint']
base_pipeline_file = MODELS_CONFIG[chosen_model]['base_pipeline_file']
batch_size = MODELS_CONFIG[chosen_model]['batch_size']
• 103. Hichem Felouat - Algeria - hichemfel@gmail.com 103 Train TF OD API on a custom dataset
# Download pretrained weights
%mkdir /content/models/research/deploy/
%cd /content/models/research/deploy/
import tarfile
download_tar = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/' + pretrained_checkpoint
!wget {download_tar}
tar = tarfile.open(pretrained_checkpoint)
tar.extractall()
tar.close()

# Download the base training configuration file
%cd /content/models/research/deploy
download_config = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/configs/tf2/' + base_pipeline_file
!wget {download_config}
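Before launching training, the downloaded base pipeline file has to be pointed at the pretrained checkpoint, your number of classes, and your TFRecord/label-map paths. A minimal sketch of this step, assuming the directory layout from the previous slides (the regular-expression edits and the model_dir path are illustrative, and the exact fields depend on the chosen config):

import re

pipeline_file = "/content/models/research/deploy/" + base_pipeline_file
fine_tune_checkpoint = "/content/models/research/deploy/" + model_name + "/checkpoint/ckpt-0"
model_dir = "/content/training/"  # where training checkpoints will be written (hypothetical path)

with open(pipeline_file) as f:
    config = f.read()

# Point the config at the pretrained checkpoint and our dataset.
config = re.sub('fine_tune_checkpoint: ".*?"',
                'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), config)
config = re.sub('num_classes: [0-9]+', 'num_classes: 4', config)  # your number of classes
config = re.sub('batch_size: [0-9]+', 'batch_size: {}'.format(batch_size), config)
config = re.sub('fine_tune_checkpoint_type: "classification"',
                'fine_tune_checkpoint_type: "detection"', config)
# Also rewrite the input_path and label_map_path entries of the
# train_input_reader and eval_input_reader to your own TFRecord and .pbtxt files.

with open(pipeline_file, 'w') as f:
    f.write(config)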
• 104. Hichem Felouat - Algeria - hichemfel@gmail.com 104 Train TF OD API on a custom dataset
!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_file} \
    --model_dir={model_dir} \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --sample_1_of_n_eval_examples=1 \
    --num_eval_steps={num_eval_steps}

Training Custom Object Detector: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html
Train Custom TensorFlow2 Object Detection Model: https://colab.research.google.com/drive/1sLqFKVV94wm-lglFq_0kGo2ciM0kecWD#scrollTo=fF8ysCfYKgTP&uniqifier=1
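Once training finishes, the checkpoint is typically exported to a SavedModel for inference, using the exporter script shipped with the API. A sketch, reusing the variables above (the output directory is a hypothetical path):

!python /content/models/research/object_detection/exporter_main_v2.py \
    --input_type image_tensor \
    --pipeline_config_path {pipeline_file} \
    --trained_checkpoint_dir {model_dir} \
    --output_directory /content/exported_model

The exported /content/exported_model/saved_model directory can then be loaded with tf.saved_model.load(), exactly as in the inference slides above.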
• 105. Hichem Felouat - Algeria - hichemfel@gmail.com 105 TF OD API - Mask-RCNN on a custom dataset
Dataset Gathering: As always, at the beginning of an AI project, once the problem statement has been identified, we move on to gathering the data, in this case images for training.
Data Labeling: Now that we have gathered the dataset, we need to label the images so that the model understands which object in each picture is of interest. To label the images we need labeling software, for example: labelme (instance segmentation).
  • 106. Hichem Felouat - Algeria - hichemfel@gmail.com 106 TF OD API - Mask-RCNN on a custom dataset labelme
• 107. Hichem Felouat - Algeria - hichemfel@gmail.com 107 TF OD API - Mask-RCNN on a custom dataset
Create TFRecords:
• After labeling the entire dataset, we now have to generate the TFRecords that serve as input for model training. labelme gives us JSON files, so we need to convert the labelme JSON labels into COCO format.
• You can use the labelme2coco package to convert labelme annotations to COCO format (a short usage sketch follows): https://github.com/fcakyon/labelme2coco
• Now that the data is in COCO format, we can create the TFRecord files. For this, we will make use of the create_coco_tf_record.py file.
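A minimal conversion sketch, assuming the labelme2coco package exposes the convert(labelme_folder, export_dir) entry point shown in its README (the folder paths are hypothetical):

import labelme2coco

labelme_folder = "/content/my_dataset/train"  # folder containing the labelme .json files
export_dir = "/content/my_dataset/"           # a dataset.json file in COCO format is written here
labelme2coco.convert(labelme_folder, export_dir)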
  • 108. Hichem Felouat - Algeria - hichemfel@gmail.com 108 TF OD API - Mask-RCNN on a custom dataset Follow the full code of Mask RCNN in my Colab Notebook: https://colab.research.google.com/drive/1mBw-dLLyM96N2KePnh9fE6Wwm4RWw9H9?usp=sharing
  • 109. Hichem Felouat - Algeria - hichemfel@gmail.com 109 Thanks For Your Attention Hichem Felouat ...