10. Outline
• Images
– Recognition
– Reconstruction
• Videos
– Surveillance
– Summarization
• Virtual Data
Ø 3D CAD models
Ø 3D Environment
11. ImageNet
• Started in 2007 @ Princeton
• Debuted in 2009 @ CVPR
• Image collection stopped in 2010
Ø Total categories: 21,841
Ø Total images: 14 million
• ILSVRC Challenge from 2010 to the present
Jia Deng, Fei-Fei Li
Info from http://www.image-net.org/
12. 1K Image Classification
Figure from Olga Russakovsky, ECCV'14 workshop
Deep Learning
13. Place-Net
• 2014 @ MIT and Princeton
Ø Total categories: 400
Ø Total images: 7 million
Bolei Zhou, Prof. Torralba
[Figure: # of images; shown alongside the figures 21,841 and 14 million]
22% classification error
Info from http://places.csail.mit.edu/
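A classification error like the one quoted above is the fraction of test images whose true label is missed by the classifier. The sketch below is only an illustration under stated assumptions: the scores and labels are made up, and the function name `top_k_error` and the parameter `k` are hypothetical, not taken from the slides. With k=1 it is the plain error rate; larger k counts a prediction as correct if the true label is among the k best guesses.

```python
import numpy as np

def top_k_error(scores, labels, k=1):
    """Fraction of samples whose true label is not among the k highest-scoring classes."""
    # argsort is ascending, so the last k columns hold the k best classes per row.
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hit = (top_k == labels[:, None]).any(axis=1)
    return 1.0 - hit.mean()

# Made-up class scores for 4 images over 3 classes (illustrative data only).
scores = np.array([
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
])
labels = np.array([1, 1, 2, 2])

top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
print(top1, top2)
```

On this toy data two of four images are misclassified at k=1, and one of four is still missed at k=2.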
14. Deep Learning: Convolutional Neural Network (CNN)
Handwritten Character "A"
[Figure: filter and response, filter and response; #filters]
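The filter and response idea on this slide can be sketched in a few lines: each filter is slid over the image, and the sum of element-wise products at each position forms that filter's response map, giving one map per filter. This is a minimal sketch, not the network from the slides; the toy image, the two 3x3 filters, and the function name `filter_response` are all hypothetical.

```python
import numpy as np

def filter_response(image, kernel):
    """Slide `kernel` over `image` (valid positions only) and return the response map."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # Response = sum of element-wise products between the window and the filter.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Toy 5x5 "image" with a vertical stroke in the middle column (illustrative data).
image = np.zeros((5, 5))
image[:, 2] = 1.0

# Two hypothetical 3x3 filters: one tuned to vertical strokes, one to horizontal strokes.
vertical = np.array([[-1.0, 2.0, -1.0]] * 3)
horizontal = vertical.T

# One response map per filter; the first axis has length #filters.
responses = np.stack([filter_response(image, k) for k in (vertical, horizontal)])
print(responses.shape)
print(responses[0].max(), responses[1].max())
```

The vertical filter responds strongly where it is centered on the stroke, while the horizontal filter gives zero everywhere, which is the sense in which different filters pick out different patterns.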
24. AlexNet, 2012, Krizhevsky et al.
2 Convolution layers
5 Convolution layers
• Many Pre-trained Networks
– Caffe http://caffe.berkeleyvision.org/
– Torch http://torch.ch/
30. Structure from Motion (SfM)
http://homes.cs.washington.edu/~shanqi/
3D browsing
Large-scale Static Scene
Photos
Dense Reconstruction
Visual Turing Test
http://phototour.cs.washington.edu/
58. Summary
• Types of big visual data
– Images
– Videos
– Virtual data
• Applications
– Search and organize: Google Photos
– Browse and visualize: Matterport, Photo Tour
– Visual commerce: Ditto.com, visiosafe
– Video summary, security, self-driving cars, etc.
59. Vision Science Lab @ NTHU, Taiwan
PI: Min Sun
Web: aliensunmin.github.io
Office: Delta 962
Lab: EECS Bldg 712
Tel: +886-3-5731058
Email: sunmin@ee.nthu.edu.tw
Goal: making a great impact in computer vision, robot vision, mobile vision, etc. We aim to build game-changing applications that improve our daily life.
Research Topics
• Analyzing Street Views
• Understanding Personal Videos
• 3D Robot Vision
• Human Sensing
• Wearable Camera Applications
• Make3D