8. Requirements for Embedded Systems
Cloud                            | Embedded
Many classes (1000s)             | Few classes (<10)
Large workloads                  | Frame rates (15-30 FPS)
High efficiency (Performance/W)  | Low cost & low power (1W-5W)
Server form factor               | Custom form factor
J. Freeman (Intel), “FPGA Acceleration in the era of high level design”, HEART2017
11. Object Detection Task
• Performs classification and localization simultaneously for multiple objects
• Evaluation method (from Pascal VOC):
Ground truth annotation
Detection results:
  >50% overlap of bounding box (BBox) with ground truth
  One BBox for each object
  Confidence value for each object, e.g. Person (50%)
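The overlap criterion above can be sketched in a few lines. This is a minimal illustration, assuming the overlap is measured as intersection over union (IoU), as in the Pascal VOC evaluation, and that boxes are given as `(x_min, y_min, x_max, y_max)`; the function names `iou` and `is_correct` are hypothetical and not taken from the slides.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct(detection, ground_truth, threshold=0.5):
    """A detection counts as correct if its overlap exceeds the threshold."""
    return iou(detection, ground_truth) > threshold
```

For example, `is_correct((0, 0, 2, 2), (1, 0, 3, 2))` is false: the two boxes share only a third of their union, which is below the 50% threshold.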
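$$precision = \frac{\#\ \text{correct detections}}{\#\ \text{all detections}}$$
$$recall = \frac{\#\ \text{correct detections}}{\#\ \text{all objects}}$$
Average Precision (AP):
$$AP = \frac{1}{11} \sum_{t \in \{0, 0.1, \ldots, 1\}} P_{\mathrm{interp}}(t)$$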
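The metrics above can be worked through in code. The sketch below assumes the 11-point interpolated AP of Pascal VOC, where the precision at recall level t is taken as the maximum precision over all points with recall at least t. It also assumes the detections have already been sorted by descending confidence and marked correct or incorrect; the function names are hypothetical.

```python
def precision_recall_curve(correct_flags, num_objects):
    """Precision and recall after each detection.

    correct_flags: booleans for detections sorted by descending confidence.
    num_objects:   number of ground-truth objects.
    """
    precisions, recalls = [], []
    correct = 0
    for count, ok in enumerate(correct_flags, start=1):
        correct += int(ok)
        precisions.append(correct / count)      # correct / all detections so far
        recalls.append(correct / num_objects)   # correct / all objects
    return precisions, recalls


def average_precision_11pt(precisions, recalls):
    """Mean of the interpolated precision at recall levels 0, 0.1, ..., 1."""
    total = 0.0
    for k in range(11):
        t = k / 10
        # Interpolated precision: best precision at any recall >= t.
        candidates = [p for p, r in zip(precisions, recalls) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11


# Three detections (correct, wrong, correct) for two ground-truth objects.
precisions, recalls = precision_recall_curve([True, False, True], num_objects=2)
ap = average_precision_11pt(precisions, recalls)
```

In this example the interpolated precision is 1 for the six recall levels up to 0.5 and 2/3 for the remaining five, so `ap` equals (6 + 10/3) / 11.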
19. Relationship Between the Number of Neurons (Feature Maps) and Recognition Accuracy in Binarized CNNs
Source: Yaman Umuroglu, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, Kees Vissers, “FINN: A Framework for Fast, Scalable Binarized Neural Network Inference”
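20. Mixed-Precision CNN
• An essential technique for object detectors
• Front stage: binary-precision CNN … saves area and gains speed
• Back stage: multi-bit-precision CNN … solves the regression problem (bounding-box estimation)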
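[Figure: input image (frame), CONV+Pooling layers producing feature maps, then a CNN producing the class score and bounding box (detection); the front stage is labeled binary precision and the back stage half precision]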
H. Nakahara et al., “A Lightweight YOLOv2: A Binarized CNN with A Parallel Support Vector Regression for an
FPGA,” Int’l Symp. on FPGA (ISFPGA), 2018.
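The front-stage/back-stage split can be illustrated with a toy example. This is only a sketch of the general idea under stated assumptions, not the implementation from the cited paper: the front stage binarizes inputs and weights to {-1, +1} (so a dot product reduces to XNOR plus popcount in hardware), and the back stage keeps real-valued weights for regression. All function names and values are hypothetical.

```python
def binarize(values):
    """Map real values to {-1, +1} with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def binary_dot(x_bits, w_bits):
    """Dot product of two {-1, +1} vectors.

    Counting matching positions is what XNOR + popcount does in hardware:
    dot = (#matches) - (#mismatches) = 2 * #matches - length.
    """
    matches = sum(1 for x, w in zip(x_bits, w_bits) if x == w)
    return 2 * matches - len(x_bits)


def front_stage(inputs, binary_weights):
    """Binary-precision feature extraction: one binarized neuron per weight row."""
    x_bits = binarize(inputs)
    return binarize([binary_dot(x_bits, w) for w in binary_weights])


def back_stage(features, float_weights, float_bias):
    """Multi-bit regression head: real-valued weighted sums (e.g. box coordinates)."""
    return [sum(f * w for f, w in zip(features, row)) + b
            for row, b in zip(float_weights, float_bias)]


features = front_stage([0.3, -1.2, 0.0, 2.0],
                       [[1, -1, 1, 1], [-1, 1, -1, -1]])
box_values = back_stage(features, [[0.5, 0.25], [1.0, -2.0]], [0.1, 0.0])
```

The binary stage needs only one bit per weight and activation, while the regression outputs remain real-valued, which is the reason the slide gives for mixing precisions.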