21. Processing with Neural Networks, SSII2020
Biological Vision
“Retina is sensitive to temporal brightness gradients”
“Retina is blind to static scenes in absence of eye movements”
Receptive fields of single neurons in the cat’s striate cortex
David H. Hubel et al., 1959 (Nobel Prize 1981)
23.
History
(Timeline figure, 1990 to 2020.)
Event-Based Sensing Device
・1991 Mahowald et al.
・2009 Lichtsteiner et al., 128 x 128
・2018 Samsung Gen.3
・2018 CeleX-V, 1280 x 960
・2020 DVXplorer, 640 x 480
・2020 Gen.4, 1280 x 720 w/ SONY
Event-Based Processing Device
・2014 IBM TrueNorth
・2018 DynapCNN
・2018 Intel Loihi
・2018 Stanford Braindrop
・2020 GrAI One
Y. Sekikawa, "Research Trends in Event Cameras and Processing with Neural Networks", SSII2020
24.
Comparison between different event cameras
Vendor | Prophesee (Chronocam) | iniVation (iniLabs) | Samsung | CelePixel (Hillhouse)
Latest version | ATIS-Gen4 | DAVIS346 | DVS Gen.4 | CeleX-V
Resolution | CD: 1280 x 720, CD+EM: ? | 346 x 260 | 1280 x 960 | 1280 x 800
Pixel pitch | CD: 4.86 μm, CD+EM: ? | 18.5 μm | 4.95 μm | 9.8 μm
Intensity information | EM (Exposure Measurement), 130 dB: an event resets a capacitor to a high voltage; brighter → faster discharge | APS (Active Pixel Sensor), 56.7 dB: similar to a standard frame | N/A | Intensity at event rate
Other feature / Info | IMAGO industrial camera; joint dev. with SONY | DVXplorer (640 x 480, no intensity) | In-home monitoring camera | Event-wise optical flow; commercial product for DSM in China
Note: Sony acquires Insightness
31.
Event generation Model
• Each pixel asynchronously reports intensity changes
Image from Kim et al., Simultaneous Mosaicing and Tracking with an Event Camera
33.
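• Temporal relation: $L(\mathbf{x}_k, t_k) - L(\mathbf{x}_k, t_{k-1}) \approx \sum_{k} p_k C$
• Spatial relation (optical flow constraint): $\Delta L = p_k C \approx -\frac{\partial L}{\partial \mathbf{x}} \cdot \mathbf{v}\,\Delta t$
Note: No event when the image gradient $\partial L / \partial \mathbf{x}$ is perpendicular to the motion $\mathbf{v}$
(Figure: log intensity vs. time.)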
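The event generation model above can be illustrated with a minimal single-pixel sketch. It assumes an ideal sensor with a fixed contrast threshold (the `threshold` argument, standing in for C) and intensities sampled at discrete times; the function name and all parameter values are hypothetical and not taken from the slides.

```python
import math

def generate_events(intensities, timestamps, threshold=0.2):
    """Emit (t, polarity) events for one pixel from sampled intensities.

    Assumption: an ideal pixel that fires whenever the log intensity has
    moved by at least `threshold` since the last event.
    """
    events = []
    ref = math.log(intensities[0])  # log intensity at the last event
    for t, value in zip(timestamps[1:], intensities[1:]):
        log_i = math.log(value)
        # One event per threshold crossing since the last event.
        while abs(log_i - ref) >= threshold:
            polarity = 1 if log_i > ref else -1
            ref += polarity * threshold
            events.append((t, polarity))
    return events

# A pixel that brightens and then stays constant: no events once it is static.
samples = [1.0, 1.5, 2.25, 2.25, 2.25]
print(generate_events(samples, [0, 1, 2, 3, 4], threshold=0.2))
```

A static pixel produces no output at all, which is the behaviour the Biological Vision slide attributes to the retina.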
34.
Event-Based Camera
• High speed (1µs)
• Low data rate/Sparse (0-30Mbps)
• No motion blur
• High dynamic range (130dB)
(Figure: comparison of a frame-based camera and an event-based camera along four axes: Speed [fps], Energy Consumption [W], Data rate [bps], Price [$].)
35.
Difficulties when dealing with event data
• Sparse data representation: frame-based algorithms cannot be applied directly
• Motion-dependent data: complicates association in SLAM and generalization in ML
Frame: motion independent
Event (histogram): motion dependent
Image from Gehrig et al., Asynchronous, Photometric Feature Tracking using Events and Frames
36.
Wide Range of Usage
Algorithm
• Tracking
• Optical Flow
• Visual odometry
• SLAM
• Image Reconstruction
• Stereo depth estimation
• 3D measurement with SL
• Object Recognition
• Etc.
Applications
• Surveillance at Home
• Obstacle avoidance
• UAV, automotive
• Bin-picking
• Gesture recognition
• Etc.
38.
Model-based Processing
ML-Based (End-to-end neural network for event processing)
Algorithm
• Feature Tracking
• Optical Flow (OF)
• Visual Odometry (VO)
• Simultaneous Localization and Mapping (SLAM)
• 3D Reconstruction
• Intensity Reconstruction (IR)
Setup
• Geometry: Planar / Known / 3D?
• Texture: Known / Estimate?
• Known / Estimate? Rotation / SE(2) / SE(3)?
• Environment: Static / Dynamic
• Use image as Proxy / Direct? Ext. sensor / Reconstruct?
(Diagram labels: Input, Algorithm.)
Image from Gallego et al., Event-based, 6-DOF Camera Tracking from Photometric Depth Maps
39.
Speed Invariant Time Surface for Learning to Detect Corner Points with Event-Based Cameras
Manderscheid et al., CVPR2019
Corner tracking
Event only
Very fast/robust corner tracking in challenging illumination conditions
Corner detection using SI (speed-invariant) time surface
Event to Time Surface
Tracking by simple nearest-neighbor association
Image (on the bottom) from Alzugaray et al., Asynchronous Corner Detection and Tracking for Event Cameras in Real Time
Video from Alzugaray et al., Asynchronous Corner Detection and Tracking for Event Cameras
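As a rough illustration of the "Event to Time Surface" step, the sketch below builds a plain exponentially decayed time surface from a list of events. This is the basic timestamp-based surface, not the speed-invariant variant proposed by Manderscheid et al.; the function name, the `(x, y, p, t)` tuple layout, and the decay constant `tau` are assumptions made for this example.

```python
import math

def time_surface(events, width, height, t_now, tau=0.05):
    """Return a height x width grid of exp(-(t_now - t_last) / tau).

    `events` is an iterable of (x, y, polarity, t) tuples. Pixels that
    never fired up to t_now are set to 0.0.
    """
    last = [[None] * width for _ in range(height)]
    for x, y, p, t in events:
        # Keep only the most recent timestamp per pixel.
        if t <= t_now and (last[y][x] is None or t > last[y][x]):
            last[y][x] = t
    return [[0.0 if t is None else math.exp(-(t_now - t) / tau) for t in row]
            for row in last]

events = [(0, 0, 1, 0.00), (1, 0, -1, 0.05), (0, 0, 1, 0.10)]
print(time_surface(events, width=3, height=1, t_now=0.10))
```

Because the values depend on elapsed time, the same edge moving faster or slower yields a different surface, which is the dependence a speed-invariant formulation is meant to remove.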
40.
EKLT: Asynchronous Photometric Feature Tracking Using Events and Frames
Gehrig et al., Depts. of Informatics and Neuroinformatics@ETH, ECCV2018, IJCV2019
159
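$\theta^{*} = \arg\min_{\theta} \left\| \Delta L_{\text{events}} - \Delta \hat{L}_{\text{frame}}(\theta) \right\|$ (both terms are shown as images on the slide)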
Very fast/robust feature tracking in challenging illumination conditions
Compare intensity increment from event with prediction from frame
Feature tracking
Event + Frame
42.
Simultaneous Optical Flow and Intensity Estimation from an Event Camera
Bardow et al., Imperial College Dyson Lab., CVPR2016
32
(Figure labels: data term, optical flow.)
Intensity & OF estimation at high rate in challenging illumination conditions
Joint optimization using optical flow constraints
IR+OF
Event
43.
Continuous-time Intensity Estimation Using Event Cameras
Scheerlinck et al., ACCV2018
37
Real-time & high rate intensity estimation in challenging illumination conditions
Complementary fusion of frame and event
IR
Event+Frame
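One way to picture "complementary fusion of frame and event" is a per-pixel filter in which events add high-frequency increments while the estimate decays toward the latest frame value between events. The sketch below follows that idea for a single pixel; the class name, the gain `alpha`, and the contrast step are hypothetical choices for illustration and are not taken from the slides or the paper's implementation.

```python
import math

class ComplementaryPixel:
    """Single-pixel log-intensity estimate fusing frames and events."""

    def __init__(self, log_frame, alpha=2.0, contrast=0.1):
        self.alpha = alpha          # assumed gain pulling toward the frame
        self.contrast = contrast    # assumed log-intensity step per event
        self.log_frame = log_frame  # latest frame value (log intensity)
        self.estimate = log_frame
        self.t = 0.0

    def _decay_to(self, t):
        # Between updates the estimate relaxes exponentially to the frame.
        decay = math.exp(-self.alpha * (t - self.t))
        self.estimate = self.log_frame + (self.estimate - self.log_frame) * decay
        self.t = t

    def on_event(self, t, polarity):
        self._decay_to(t)
        self.estimate += polarity * self.contrast
        return self.estimate

    def on_frame(self, t, log_frame):
        self._decay_to(t)
        self.log_frame = log_frame
        return self.estimate

pixel = ComplementaryPixel(log_frame=0.0, alpha=1.0, contrast=0.1)
print(pixel.on_event(0.0, +1))   # event raises the estimate immediately
print(pixel.on_frame(1.0, 0.0))  # estimate has decayed back toward the frame
```

The design choice shown here is that updates happen only when an event or frame arrives, so the estimate can be queried at event rate without dense per-frame processing.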
45.
Simultaneous Mosaicing and Tracking with an Event Camera
Kim et al., Imperial College London, BMVC2014
Localization (PF)
39
Mapping (EKF based IR)
SLAM (rotation only) in challenging illumination conditions
Mapping by intensity reconstruction (pioneering work for event-based SLAM)
Intensity map
SLAM(SO(3))
Event
47.
Real-Time 3D Reconstruction and 6-DoF Tracking with an Event Camera
Kim et al., Imperial College London, ECCV2016 (Best Paper)
47
Full 6DOF SLAM in challenging illumination conditions
Extension of Kim2014 to SE(3) by incorporating depth estimation
SLAM(SE(3))
Event
48.
Focus Is All You Need: Loss Functions For Event-based Vision
Gallego et al., UZH@ETH, CVPR2018, CVPR2019
120
Efficient motion (OF) estimation in challenging illumination conditions
OF estimation w/o intensity. Novel focus-based loss
OF
Event
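The idea of estimating motion without intensity by maximizing a focus measure can be sketched in one dimension: warp each event back along a candidate velocity, accumulate the warped events into a histogram, and prefer the velocity that makes the histogram sharpest. The example below uses variance as the focus score and a brute-force search over candidates; the function names, the 1-D setting, and the candidate list are assumptions for illustration only.

```python
def warped_variance(events, velocity, width):
    """Variance of the image of warped events for a 1-D candidate velocity."""
    counts = [0] * width
    for x, t in events:
        xw = int(round(x - velocity * t))  # warp event back to t = 0
        if 0 <= xw < width:
            counts[xw] += 1
    mean = sum(counts) / width
    return sum((c - mean) ** 2 for c in counts) / width

def estimate_velocity(events, candidates, width):
    """Pick the candidate velocity with the sharpest warped-event image."""
    return max(candidates, key=lambda v: warped_variance(events, v, width))

# Events from an edge starting at x = 2 and moving at 3 px per time unit.
events = [(2 + 3 * t, t) for t in range(6)]
print(estimate_velocity(events, candidates=[0, 1, 2, 3, 4], width=20))
```

With the correct velocity all events collapse onto one pixel, so the variance peaks there; a practical implementation would replace the exhaustive search with gradient-based optimization.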
53.
EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real-time
Rebecq et al., UZH@ETH, RAL2016
48
Very fast SLAM (500Hz on CPU) in challenging illumination conditions
Utilize DSI (no IR; edge-map alignment suffices)
SLAM(SE(3))
Event
55.
Ultimate SLAM? Combining events, images, and IMU for robust visual SLAM in HDR and high speed scenarios
Vidal et al., RPG@ETH, Robotics and Automation Letters 2017
50
Efficient SLAM in challenging illumination conditions
Utilize all available sensors for computational efficiency and robustness
SLAM(SE(3))
Event+Frame+Gyro
62.
ML-Based (End-to-end neural network for event processing)
Event camera: $e_i := (x, y, p, t)_i$
Frame-based processing: $f_D(g(e)) = y$, with event-to-frame conversion $g(\cdot)$
Pros: Can utilize existing architectures (e.g., CNN)
Cons: Inefficient, slow
Event-based processing: $f_S(e) = y$
Pros: Efficient (processes only changed pixels)
Cons: No established method like CNN
64.
Event-based Vision meets Deep Learning on Steering Prediction for Self-driving Cars
Maqueda et al., Dept. of Informatics and Neuroinformatics@ETH, CVPR2018
126
Sophisticated CNN can be used
Convert sparse events to dense frame
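A minimal sketch of converting sparse events to a dense frame is to count events per pixel, keeping positive and negative polarities in separate channels so that a standard CNN can consume the result. The function name and the `(x, y, p, t)` tuple layout are assumptions for this example; the exact accumulation window and normalization used in the paper are not reproduced here.

```python
def events_to_frames(events, width, height):
    """Accumulate events into two count images: (positive, negative)."""
    pos = [[0] * width for _ in range(height)]
    neg = [[0] * width for _ in range(height)]
    for x, y, p, t in events:
        if p > 0:
            pos[y][x] += 1
        else:
            neg[y][x] += 1
    return pos, neg

events = [(0, 0, 1, 0.0), (0, 0, 1, 0.1), (1, 1, -1, 0.2)]
pos, neg = events_to_frames(events, width=2, height=2)
print(pos, neg)
```

Every pixel of the dense frame is then processed by the network whether or not it changed, which is the inefficiency the frame-based route is criticized for elsewhere in the deck.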
66.
Learning an event sequence embedding for dense event-based deep stereo
Tulyakov et al., EPFL, ICCV2019
Event camera: $e_i := (x, y, p, t)_i$
Frame-based processing: $f_D(g(e)) = y$, with a learned event-to-frame conversion $g(\cdot)$
Better than hand-crafted conversion
Learn to convert sparse events to a dense frame (temporal kernel)
72.
High Speed and High Dynamic Range Video with an Event Camera
Rebecq et al., UZH@ETH, CVPR 2019, PAMI 2019
Existing algorithms become readily applicable to challenging applications
Learning to convert sparse events to intensity frames
75.
ML-Based (End-to-end neural network for event processing)
Event camera: $e_i := (x, y, p, t)_i$
Frame-based processing: $f_D(g(e)) = y$, with event-to-frame conversion $g(\cdot)$
Pros: Can utilize existing architectures (e.g., CNN)
Cons: Inefficient, slow
Event-based processing: $f_S(e) = y$
Event-based processing (Spike)
Pros: Efficient (processes only changed pixels)
Cons: Requires neuromorphic H/W (fewer neurons), difficult to train
Event-based processing (Continuous)
76.
SNN (Spiking Neural Network): 3rd generation of neural networks
Leaky Integrate-and-Fire (LIF)
Charge → Fire (10 ms) → Refractory (100 ms) →
Activation: non-differentiable spike (ANN: ReLU, sigmoid)
Asynchronous: MP※ > threshold → Fire (similar to an event camera)
Chain rule: $\frac{\partial E}{\partial a_{1}} = \frac{\partial E}{\partial a_{n}} \cdots \frac{\partial a_{2}}{\partial a_{1}}$; the spike is non-differentiable, so the chain rule cannot be applied directly
※MP: membrane potential
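The charge, fire, refractory cycle of a leaky integrate-and-fire neuron can be sketched with a simple discrete-time simulation. The time constants, threshold, and refractory length below are arbitrary illustrative values (they do not correspond to the 10 ms and 100 ms figures on the slide), and the function name is hypothetical.

```python
def simulate_lif(input_current, dt=1.0, tau=20.0, threshold=1.0, refractory=5.0):
    """Return spike times of a leaky integrate-and-fire neuron.

    The membrane potential v leaks toward 0 with time constant tau,
    integrates the input, fires when it reaches `threshold`, and is then
    held silent for `refractory` time units.
    """
    v = 0.0
    refractory_left = 0.0
    spike_times = []
    for step, current in enumerate(input_current):
        if refractory_left > 0:
            refractory_left -= dt      # ignore input while refractory
            continue
        v += dt * (-v / tau + current)  # leaky integration (charge)
        if v >= threshold:              # membrane potential > threshold
            spike_times.append(step * dt)
            v = 0.0                     # reset after firing
            refractory_left = refractory
    return spike_times

print(simulate_lif([0.2] * 20))    # strong input: periodic spikes
print(simulate_lif([0.04] * 200))  # weak input: leak wins, no spikes
```

The hard threshold in the `if v >= threshold` branch is the non-differentiable step that prevents a direct application of the chain rule.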
77.
SNN Hardware
Chip | TrueNorth | DYNAP | Loihi | Braindrop
Manufacturer | IBM | aiCTX | Intel | Stanford
Type of neurons | Digital LIF | Analog LIF | Digital LIF | Analog
Neurons per chip | 1,000,000 | 4096 x 4 | 130,000 x 8 | 4096
Year | 2014 | 2017 | 2018 | 2018
Programming | Corelet, Eedn | libcaer / cAER in C/C++ | Nengo / Brian / PyNN | Nengo
Training | Outside chip | On chip | On chip | Outside chip
For a more detailed review see Young et al., A Review of Spiking Neuromorphic Hardware Communication Systems, IEEE Access 2019
78.
Categorization of training SNN
Supervised
• Rewarded-STDP
• ANN (back-propagation) to SNN
• Back-propagation
  • Continuous relaxation (approximate gradient / inefficient)
  • Temporal coding (exact / dead neuron)
  • Random back-propagation
Unsupervised
• STDP
79.
Synaptic Modifications in Cultured Hippocampal Neurons: Dependence on Spike Timing, Synaptic Strength, and Postsynaptic Cell Type
Bi et al., Journal of Neuroscience, 1998
Images from arc-instruments
Weight update
Simple unsupervised training for non-differentiable spike. Neuroplausible
Hebb's rule: “cells that fire together, wire together”
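The spike-timing-dependent weight update can be sketched with the commonly used pair-based exponential window: a presynaptic spike shortly before a postsynaptic spike strengthens the synapse, and the reverse order weakens it. The amplitudes and time constant below are illustrative assumptions, not values from Bi et al., and the function name is hypothetical.

```python
import math

def stdp_delta_w(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based STDP weight change for one pre/post spike pair."""
    dt = t_post - t_pre
    if dt > 0:   # pre fires before post: potentiation
        return a_plus * math.exp(-dt / tau)
    if dt < 0:   # post fires before pre: depression
        return -a_minus * math.exp(dt / tau)
    return 0.0

print(stdp_delta_w(t_pre=10.0, t_post=15.0))  # positive change
print(stdp_delta_w(t_pre=15.0, t_post=10.0))  # negative change
```

The rule needs only local spike times and no gradient, which is why it applies to non-differentiable spikes.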
80.
Image from Diehl et al., Unsupervised learning of digit recognition using spike-timing-dependent plasticity
Simple mapping yields 95% accuracy
81.
A Low Power, Fully Event-Based Gesture Recognition System
Amir et al., IBM Research+UZH-ETH, CVPR2017
Realized efficient gesture recognition using real SNN H/W (TrueNorth)
Convert trained ANN to SNN
83.
Training Deep Spiking Neural Networks Using Backpropagation
Lee et al., Institute of Neuroinformatics@ETH, Frontiers in Neuroscience 2016
44
Events by emulating saccades
(Figure labels: outputs 0 to 9, Non-differentiable, Error, Est, Ref.)
E2E training of SNN
Approximate the non-differentiable spike with a differentiable low-pass-filtered spike
85.
Random synaptic feedback weights support error backpropagation for deep learning
Lillicrap et al., Univ. Oxford, Nature Communications 2016
Symmetric backpropagation (chain rule on ANN)
Asymmetric backpropagation (mammal neuron)
Random backpropagation
Direct feedback
Neuroplausible training
DNN can be trained using a random matrix $B$ instead of the symmetric (transposed) weights
Direct error feedback
For more detail see http://www.cs.toronto.edu/~tingwuwang/2546.pdf
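The difference between symmetric backpropagation and random feedback can be shown on the hidden-layer error signal of a two-layer network: backpropagation sends the output error back through the transpose of the forward weights, while random feedback uses a fixed random matrix of the same shape. The helper names and the tiny example matrices below are hypothetical and only illustrate the substitution.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def hidden_delta_backprop(w_out, error, act_grad):
    """Exact hidden error: W_out^T e, gated by the activation derivative."""
    w_out_t = [list(col) for col in zip(*w_out)]
    return [d * g for d, g in zip(matvec(w_out_t, error), act_grad)]

def hidden_delta_random(b_feedback, error, act_grad):
    """Random feedback: a fixed matrix B replaces W_out^T."""
    return [d * g for d, g in zip(matvec(b_feedback, error), act_grad)]

w_out = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]      # 2 outputs x 3 hidden units
b_feedback = [[0.5, 0.0], [0.0, 0.5], [1.0, 1.0]]  # 3 hidden x 2 outputs, fixed
error = [1.0, -1.0]
act_grad = [1.0, 0.0, 1.0]  # e.g. ReLU derivative at the hidden units

print(hidden_delta_backprop(w_out, error, act_grad))
print(hidden_delta_random(b_feedback, error, act_grad))
```

Since `hidden_delta_random` never reads `w_out`, the backward path does not need access to the forward weights, which is the property that makes the scheme attractive for on-chip learning.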
86.
Event-Driven Random Back-Propagation: Enabling Neuromorphic Deep Learning Machines
Neftci et al., Univ. California+Intel, Frontiers in Neuroscience 2016
Weight update
Enables on-chip training & layer-by-layer parallelization
Apply RBP to SNN.
88.
EventNet: Asynchronous recursive event processing
Sekikawa et al., CVPR 2019
133
Event-based processing (Continuous): $f_S(e) = y$
Real-time event-wise inference on CPU
Recursive formulation & LUT to drastically reduce computational complexity