Putting the latest Computer Vision and
Deep Learning algorithms to work
The Opportunities and Challenges
Albert Y. C. Chen, Ph.D.
Vice President, R&D
Viscovery
Albert Y. C. Chen, Ph.D.
• Experience
2017-present: Vice President of R&D at Viscovery
2016-2017: Chief Scientist at Viscovery
2015: Principal Scientist at Nervve Technologies
2013-2014: Computer Vision Scientist at Tandent Vision
2011-2012: GE Global Research
• Education
Ph.D. in Computer Science, SUNY-Buffalo
M.S. in Computer Science, NTNU
B.S. in Computer Science, NTHU
• Some random things about me…
SUNY Excellence in Teaching Award, 2010.
Some rapid promotions, some failed startups, some
patents, some papers…
1. W. Wu, A. Y. C. Chen, L. Zhao, and J. J. Corso. Brain tumor detection and segmentation in a CRF framework with pixel-wise affinity and superpixel-level features. International Journal of Computer Assisted Radiology and Surgery, 2015.
2. S. N. Lim, A. Y. C. Chen, and X. Yang. Parameter Inference Engine (PIE) on the Pareto Front. In Proceedings of International Conference on Machine Learning, AutoML Workshop, 2014.
3. A. Y. C. Chen, S. Whitt, C. Xu, and J. J. Corso. Hierarchical supervoxel fusion for robust pixel label propagation in videos. In Submission to ACM Multimedia, 2013.
4. A. Y. C. Chen and J. J. Corso. Temporally consistent multi-class video-object segmentation with the video graph-shifts algorithm. In Proceedings of IEEE Workshop on Applications of Computer Vision, 2011.
5. D. R. Schlegel, A. Y. C. Chen, C. Xiong, J. A. Delmerico, and J. J. Corso. Airtouch: Interacting with computer systems at a distance. In Proceedings of IEEE Workshop on Applications of Computer Vision, 2011.
6. A. Y. C. Chen and J. J. Corso. On the effects of normalization in adaptive MRF Hierarchies. In Proceedings of International Symposium CompIMAGE, 2010.
7. A. Y. C. Chen and J. J. Corso. Propagating multi-class pixel labels throughout video frames. In Proceedings of IEEE Western New York Image Processing Workshop, 2010.
8. A. Y. C. Chen and J. J. Corso. On the effects of normalization in adaptive MRF Hierarchies. Computational Modeling of Objects Represented in Images, pages 275–286, 2010.
9. Y. Tao, L. Lu, M. Dewan, A. Y. C. Chen, J. J. Corso, J. Xuan, M. Salganicoff, and A. Krishnan. Multi-level ground glass nodule detection and segmentation in CT lung images. Medical Image Computing and Computer-Assisted Intervention, 2009.
10. A. Y. C. Chen, J. J. Corso, and L. Wang. Hops: Efficient region labeling using higher order proxy neighborhoods. In Proceedings of IEEE International Conference on Pattern Recognition, 2008.
Some work done before I
caught the startup fever
AirTouch HCI interface for Content-based Image Retrieval (CBIR)
[Diagram labels: Start: AirTouch waits in background for the initialization signal; Initialize; Freestyle Sketching Stage; Terminate; Output; CBIR query; image database; Results]
Interactive Segmentation & Classification
• Segmentation then classification:
• computationally more efficient,
• results in much higher classification accuracy.
• Pioneered the “pixel label propagation” field.
• First to utilize superpixels and supervoxels for the task.
[Figure labels: label a subset of pixels (FG, BG); pixel label map; traditional spatial propagation; spatio-temporal propagation (time axis)]
Image/Video Object Recognition
and Content Understanding
[Figure labels: High-Level: reasoning over an ontology (person, object; relations: approaches, carries, gives, receives). Mid-Level: activities (approach, carries, gives, receives) between Person 1 and Person 2 over time. Low-Level: individual detections (x).]
Learning and Adapting Optimal
Classifier Parameters
[Figure labels: image-level feature space (subspaces A, B, C); priors; patch-level feature space; posterior probability; suggest optimal parameter configuration]
Graphical Models and
Stochastic Optimization
(a) The space-time volume of a video, showing the objects (A-F) and the time span in which each appears.
(b) The temporal relationship graph. An edge between two vertices means that the two objects overlap in time.
(c) The goal: cover all objects with the smallest number of "ground truth key frames" (key 1 and key 2 in the figure).
(d) This translates to iteratively solving the max clique problem until all vertices belong to a clique.
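The key-frame selection in (c) and (d) can be sketched roughly as follows. This is a minimal greedy illustration, not the author's implementation: the time spans in `SPANS` are hypothetical, the brute-force `max_clique` is only practical for a handful of objects, and greedily peeling off max cliques is not guaranteed to give the smallest possible number of key frames.

```python
from itertools import combinations

def overlaps(a, b):
    """True if the closed time spans a=(start, end) and b=(start, end) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]

def max_clique(vertices, spans):
    """Brute-force the largest set of mutually overlapping objects."""
    ordered = sorted(vertices)
    for size in range(len(ordered), 0, -1):
        for group in combinations(ordered, size):
            if all(overlaps(spans[u], spans[v]) for u, v in combinations(group, 2)):
                return group
    return ()

def select_key_frames(spans):
    """Peel off max cliques until every object belongs to one."""
    remaining = set(spans)
    keys = []
    while remaining:
        clique = max_clique(remaining, spans)
        # Pairwise-overlapping intervals share a common instant: the latest start.
        frame = max(spans[v][0] for v in clique)
        keys.append((frame, clique))
        remaining -= set(clique)
    return keys

# Hypothetical (start_frame, end_frame) spans for objects A..F.
SPANS = {"A": (0, 40), "B": (10, 50), "C": (30, 60),
         "D": (55, 90), "E": (70, 100), "F": (80, 120)}
KEY_FRAMES = select_key_frames(SPANS)
for frame, objects in KEY_FRAMES:
    print(f"key frame {frame}: covers {', '.join(objects)}")
```

With these made-up spans, two key frames are enough to cover all six objects, which mirrors the "key 1, key 2" picture on the slide.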
[Figure labels: frame t-1, frame t; layer n, layer n+1; temporal shift; shift; µ]
Medical Imaging and
Geospatial Imaging
GGN (ground glass nodule) detection and segmentation in lung CT
Geospatial imaging: building detection
Brain tumor detection and segmentation in MR images
Why Risk to Innovate?
• A good business model NEVER lasts forever.
• Average “shelf life” on S&P 500: 20 years.
• 100-year-old companies reinvent themselves every 10-20 years.
• Startups contribute to 20% of USA’s GDP.
The Death of a Good
Business Model
• Foxconn 20-year revenue vs. net profit (now at 5%)
What do 100 year old
corporations do?
GE Schenectady, 1896
History of change at GE
• 1896: one of the 12 original companies on the Dow Jones Industrial Average (also the only one remaining).
• 1889: lightbulbs
• 1919: radios
• 1927: TV
• 1941: jet engine
• 1960: nuclear power
• 1971: room AC units
• 1995: MRI
History of change at IBM
• 1960s: mainframe computer
• 1980s: personal computer
• 2000s: integrated solutions
• 2020s: AI, Watson
How about the leading
Semiconductor companies?
NVidia reinventing itself
—2 times in 20 years
“Bad money drives out good”
in the desktop GPU market
The rise of mobile computing,
and how NVidia missed the boat!
NVidia’s Tegra mobile
processors never took off
then, the market
saturated…
NVidia did not just survive.
NVidia is thriving!
Meet the new NVidia: Deep Learning,
Deep Learning, and still, Deep Learning
The king is dead,
long live the king!
Now, again, do we want to
do OEM/ODM forever?
Optimizing an old business model
is just delaying its eventual death.
Startups
• A company, partnership, or temporary
organization designed to search for a new,
repeatable and scalable business model.
Your Idea
• Are you passionate about it?
• Is it disruptive enough?
• What is your business plan?
• What is it?
• Can it make money?
• What is the future of the idea?
• What is your competitive advantage?
• How do you build up your entry barrier?
A minimal startup team
• A hacker
• A hustler
• A hipster
Startup Timeline
Prototype
• Hack out a prototype
• Spend 2-10 weeks max.
• Investors are much more likely to fund you if
you have a minimal initial version of your idea.
• Hackathons are a good place to start.
• Iteratively improve the prototype
Money!
Build up your entry barrier!
• Market (users)
• Speed
• Team
• Technology
Building entry barrier with Technology!!
Computer Vision, it can’t be
that hard, right?
Brief History
Marvin Minsky
“In 1966, Minsky hired a first-year undergraduate
student and assigned him a problem to solve over the
summer: connect a television camera to a computer
and get the machine to describe what it sees.”
Gerald Sussman
The student never worked on
Computer Vision problems again.
Brief History
• 1960’s: interpretation of synthetic worlds
• 1970’s: some progress on interpreting selected images
• 1980’s: ANNs come and go; shift toward geometry and increased
mathematical rigor
• 1990’s: face recognition; statistical analysis in vogue
• 2000’s: broader recognition; large annotated datasets available; video
processing starts
Guzman ‘68; Ohta and Kanade ‘78; Turk and Pentland ‘91
What’s in our arsenal?
• Image filters
• Feature descriptors
• Classifiers
Filters: blurring
Filters: sharpen
Filters: edge
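A minimal sketch of the three filters named so far (blurring, sharpen, edge) as 3x3 kernels applied to a grayscale image stored as nested lists. The kernel values are common textbook choices, assumed here for illustration and not taken from the slides; `filter2d` is a hypothetical helper that skips the one-pixel border.

```python
# Common textbook 3x3 kernels (an assumption, not from the slides).
KERNELS = {
    "blur": [[1 / 9] * 3 for _ in range(3)],           # box average
    "sharpen": [[0, -1, 0], [-1, 5, -1], [0, -1, 0]],  # identity plus Laplacian boost
    "edge": [[0, 1, 0], [1, -4, 1], [0, 1, 0]],        # discrete Laplacian
}

def filter2d(img, kernel):
    """Apply a 3x3 kernel to every interior pixel; the border is dropped."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (w - 2) for _ in range(h - 2)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y - 1][x - 1] = sum(
                kernel[j][i] * img[y + j - 1][x + i - 1]
                for j in range(3) for i in range(3)
            )
    return out

# A flat image and a vertical step edge as toy inputs.
FLAT = [[10] * 5 for _ in range(5)]
STEP = [[0, 0, 10, 10, 10] for _ in range(5)]

FLAT_EDGE = filter2d(FLAT, KERNELS["edge"])
STEP_EDGE = filter2d(STEP, KERNELS["edge"])
print(STEP_EDGE[0])
```

The Laplacian responds only around the step and stays at zero on the flat image, which is the behaviour an edge filter is meant to have.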
Filters: straight lines
Features:
Features: Harris Corners
Features: Laplacian of Gaussian
(LoG; scale detection)
Features: Orientation
How to compute the rotation?
Create edge orientation
histogram and find peak.
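One way to read "create edge orientation histogram and find peak" is sketched below. It assumes central-difference gradients as the edge orientation, weights each vote by gradient magnitude, and uses 36 bins of 10 degrees (a SIFT-like choice); none of these details come from the slide, and `dominant_orientation` is a hypothetical name.

```python
import math

def dominant_orientation(patch, bins=36):
    """Return the centre (in degrees) of the peak bin of a
    magnitude-weighted gradient-orientation histogram."""
    hist = [0.0] * bins
    width = 360.0 / bins
    h, w = len(patch), len(patch[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # horizontal central difference
            gy = patch[y + 1][x] - patch[y - 1][x]  # vertical (image y points down)
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[int(angle / width) % bins] += mag
    peak = max(range(bins), key=hist.__getitem__)
    return (peak + 0.5) * width

# Toy patches: intensity ramps along x and along the diagonal.
RAMP_X = [[x for x in range(5)] for _ in range(5)]
RAMP_DIAG = [[x + y for x in range(5)] for y in range(5)]
print(dominant_orientation(RAMP_X), dominant_orientation(RAMP_DIAG))
```

Rotating the patch by the negative of this angle before computing a descriptor is what makes the descriptor rotation invariant.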
Features: SIFT
Features: Gabor
Classifiers: SVM
Classifiers: Ensemble
Classifiers: Random Fields
Classifiers: Deformable Parts
Model (DPM)
Classifiers: Deep Neural Network
What alg. should I use then?
• How much data do we have?
• What objects are we trying to detect?
• For example, Google’s DNN trained with 11k images
over 20 classes in 2013 doesn’t always beat DPM.
[Chart: per-class scores for DNN vs. DPM over 20 classes (aero, bike, bird, boat, bottle, bus, car, cat, chair, cow, dog, horse, m-bike, person, plant, sheep, sofa, table, train, TV); y-axis from 0 to 0.6]
ML alg. and their Applications
• Deep Learning
• Markovian/Bayesian
• Feature Matching
• Other ML methods
Meta-Learning
• Different use cases call for different ML algorithms.
• Meta-Learning: learning how to learn.
• Requires plenty of domain-specific know-how.
Maturing Computer Vision
Applications
• Final inspection cells
• Robot guidance and checking orientation of components
• Packaging Inspection
• Medical vial inspection
• Food pack checks
• Verifying engineered components
• Wafer Dicing
• Reading of Serial Numbers
• Inspection of Saw Blades
• Inspection of Ball Grid Arrays (BGAs)
• Surface Inspection
• Measuring of Spark Plugs
• Molding Flash Detection
• Inspection of Punched Sheets
• 3D Plane Reconstruction with Stereo
• Pose Verification of Resistors
• Classification of Non-Woven Fabrics
1970s-now: Machine Vision
for Industrial Inspection
• Automated Train Examiner (ATEx) Systems
• Automatic PCB inspection
• Wood quality inspection
• Final inspection of sub-assemblies
• Engine part inspection
• Label inspection on products
• Checking medical devices for defects
Industrial Inspection: turbofan
jet engine blade maintenance
• Some seemingly daunting machine vision tasks actually work with relatively simple image processing algorithms.
Industrial Inspection: Cognex Omniview
License Plate Recognition
(1979-now)
License Plate Readers with Text
Detection and Neural Networks
Biometrics
Automated Fingerprint
Identification (1970s-now)
Face Recognition
(1990s-now)
• Face Detection (Viola and Jones, 2001)
• Face Verification (1:1) vs. Identification (1:N)
Face Verification and Identification,
Labeled Faces in the Wild (LFW)
Recognition
Accuracy:
• 1 to 1: 99%+
• 1 to 100: 90%
• 1 to 10,000:
50%-70%.
• 1 to 1M: 30%.
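The difference between 1:1 verification and 1:N identification can be sketched with toy face embeddings. Everything here is assumed for illustration: the 3-D vectors, the 0.8 threshold, cosine similarity as the matching score, and the helper names `verify` and `identify`. Real systems compare high-dimensional embeddings produced by a trained network.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(probe, claimed, threshold=0.8):
    """1:1 - accept or reject one claimed identity."""
    return cosine(probe, claimed) >= threshold

def identify(probe, gallery):
    """1:N - return the best-matching gallery name and its score."""
    name = max(gallery, key=lambda k: cosine(probe, gallery[k]))
    return name, cosine(probe, gallery[name])

# Hypothetical enrolled embeddings and one probe.
GALLERY = {"alice": [1.0, 0.0, 0.0],
           "bob": [0.0, 1.0, 0.0],
           "carol": [0.6, 0.8, 0.0]}
PROBE = [0.9, 0.1, 0.0]

print(verify(PROBE, GALLERY["alice"]))
print(identify(PROBE, GALLERY))
```

Verification makes one comparison per query, while identification must rank the probe against every enrolled identity, so each extra gallery entry is another chance for a wrong match to outscore the right one. That is one intuition for why the accuracy figures above fall as N grows.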
LFW dataset, common FN↑, FP↓
Sports—NFL first down line
(1995-now)
Sports—NFL first down line
minus
equals
3D Reconstruction
(As old as CV; became
practical since SIFT)
3D Reconstruction with Feature
Matching, Structure from Motion
Image Panoramas
(1980s - now)
Solving Panorama Problem
with Markov Random Fields
Input:
ICM (Iterated Conditional Modes), 1986
Belief Propagation (1980-2000)
Graph-Cuts (alpha expansion), 2001
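Of the three MRF solvers named on these slides, ICM is the simplest to write down. The sketch below runs it on a toy 1-D chain with a Potts smoothness term; it is not the panorama pipeline itself. In a stitching setting the label at each pixel would be the source image to copy from. The costs in `UNARY`, the weight, and the function name `icm` are all hypothetical.

```python
def icm(unary, pairwise_weight=1.0, sweeps=10):
    """Iterated Conditional Modes on a 1-D chain MRF.

    unary[i][l] is the cost of giving site i label l; neighbouring
    sites pay pairwise_weight whenever their labels differ (Potts model).
    """
    n, k = len(unary), len(unary[0])
    # Start from the label that minimises each unary cost on its own.
    labels = [min(range(k), key=lambda l, i=i: unary[i][l]) for i in range(n)]
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            def local_cost(l):
                cost = unary[i][l]
                if i > 0:
                    cost += pairwise_weight * (l != labels[i - 1])
                if i < n - 1:
                    cost += pairwise_weight * (l != labels[i + 1])
                return cost
            best = min(range(k), key=local_cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:  # converged to a local minimum
            break
    return labels

# Hypothetical costs: the middle site weakly prefers label 1,
# its neighbours strongly prefer label 0.
UNARY = [[0.0, 1.0], [0.0, 1.0], [0.6, 0.4], [0.0, 1.0], [0.0, 1.0]]
LABELS = icm(UNARY)
print(LABELS)
```

ICM only ever makes greedy local moves, so it can get stuck in poor local minima. Belief propagation and graph cuts (alpha expansion) were developed to find lower-energy labelings of the same kind of model.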
Photo Synthesis
Solving Photo Synthesis Problems
with Alpha Matting (2000s-now)
Object Detection & Classification
state-of-the-art
• ImageNet Large Scale Visual
Recognition Challenge (ILSVRC)
• 1000+ classes, 1.2M images.
[Chart: ILSVRC classification error and classification+localization error, 2011 to 2014; y-axis from 0 to 0.5]
Image Scene Classification
• MIT Places 401
dataset.
• top-5 accuracy
rates >80%.
Self-driving cars (2000s-now)
DARPA Grand Challenge
(2005)
2005 winner, Stanley (Stanford),
averaged about 19 mph through the desert
DARPA Urban Challenge
(2007)
2007 winner, Boss (CMU),
13 mph through the city
Self Driving Cadillac, US
congressman to airport, 2013
Google Self Driving Car,
2015
Google Self-Driving Car,
2016
NVidia Self Driving Car, 2016
How did we come this far?
Race car drivers know the trick
Focus on Free Space /
Drivable Area, not Obstacles!
Up-and-coming
Computer Vision
Applications
Structure from X, Floored
Structure from X, PIX4D
Object Recognition
Blue River Technology
Augmented Reality
Magic Leap
IMRSV
Retail Insights
Source:
Prism Skylabs
Other Applications in
Business Intelligence
• Measure brand exposure.
• Measure sponsorship effectiveness.
• Loss prevention and retail layout optimization.
How about Smart
Surveillance?
Angel.co
My humble attempts at
putting the latest Computer
Vision algorithms to work
Intrinsic Imaging at Tandent
Vision Science
Computer Vision would be half-solved without shadows!
[Figure panels: Light, Original Image, Surface]
Tandent Lightbrush
Video Tutorial for Tandent Lightbrush: https://vimeo.com/47009123
Issues
• Highly anticipated, highly acclaimed, but small
crowd at $500 a license.
• Adobe Photoshop monopoly and the “not
invented here” syndrome.
• Adobe’s arch-rival, Corel (Corel Draw, Paint
Shop Pro, Ulead PhotoImpact) was DYING and
asked too much from the botched deal.
Have fun scribbling out your
shadows in photoshop!
Poor Bob from Adobe wasted 9 minutes removing just 1 shadow
Intrinsic Imaging for improving the
RGB signal in autonomous driving
Intrinsic Imaging’s other
applications
Retrospect
• 20 researchers burned 25 million in 8 years;
investors got 50 patents in return, period.
• Overestimated the total addressable market
size, in a market with existing monopoly.
• Many missed opportunities. A counterexample to the lean startup model.
Some SfM, SLAM startups
Satellite/Aerial Imagery Analysis
• 40cm resolution at 30fps for 90 sec for any location on earth.
• One LEO satellite revisits any place on Earth every 3 days.
• Need 24 satellites to revisit any place on Earth every 3 hours.
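The constellation size above follows from simple division, sketched here under a strong simplifying assumption that is not stated on the slide: identical, evenly phased satellites whose combined revisit interval shrinks in proportion to their number. `satellites_needed` is a hypothetical helper.

```python
import math

def satellites_needed(single_revisit_hours, target_revisit_hours):
    """Number of evenly phased satellites needed to hit a target revisit interval."""
    return math.ceil(single_revisit_hours / target_revisit_hours)

# One satellite revisits every 3 days (72 hours); the target is every 3 hours.
COUNT = satellites_needed(3 * 24, 3)
print(COUNT)
```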
Challenges for Single satellite depth
estimation and 3D reconstruction
• At 30fps, a LEO satellite
travels 250m between two
consecutive frames —>
theoretically sufficient for
cm-level depth estimation.
• Sources of Noise:
• Camera distortions
• Atmospheric Disturbance
• Ground vegetation
• Sub-pixel sampling noise
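The 250 m figure can be checked with the circular-orbit speed formula v = sqrt(GM / (R + h)). The 500 km altitude below is an assumption (the slide does not give one); the gravitational parameter and mean Earth radius are standard values, and the function names are hypothetical.

```python
import math

MU_EARTH = 3.986004418e14   # Earth's gravitational parameter GM, in m^3/s^2
R_EARTH = 6_371_000.0       # mean Earth radius, in m

def orbital_speed(altitude_m):
    """Speed of a satellite on a circular orbit at the given altitude."""
    return math.sqrt(MU_EARTH / (R_EARTH + altitude_m))

def frame_baseline_m(altitude_m, fps):
    """Distance the satellite travels between two consecutive frames."""
    return orbital_speed(altitude_m) / fps

# Assumed altitude of 500 km, frame rate of 30 fps as on the slide.
BASELINE = frame_baseline_m(500e3, 30.0)
print(f"{orbital_speed(500e3):.0f} m/s, {BASELINE:.0f} m between frames")
```

At this altitude the satellite moves at roughly 7.6 km/s, so consecutive frames are a little over 250 m apart, consistent with the slide. Whether that stereo baseline actually yields cm-level depth depends on the noise sources listed above.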
What happened?
• B2B customers take too long to strike deals.
• Google ate us alive in just 3 months, while we
were still pitching for VC-funding with our
prototype.
Visual Search at Nervve
Retrospect
• Growth pains expanding from intelligence
community clients to advertisement clients.
• Forming the right team of engineers and
researchers and moving at the right pace.
• For any Computer Vision/Machine Learning
company:
• Researchers that cannot program—> OUT
• Engineers that don’t know math —> OUT
Visual Search, Simply Smarter
Once in a lifetime opportunity in
China’s video streaming market
What do we need?
[Diagram labels: Face, Motion, Image scene, Text, Audio, Object; Semantics]
Viscovery VDS (Video Discovery Service)
Challenges Encountered
Along the Way
• From Product Recognition in Images, to Face,
Logo, Object, Scene recognition in Videos.
• Number of Categories
• Recognition Accuracy
• Recognition Speed
• System Architecture
• Business Model
Viscovery’s Edge
• Market: first mover’s advantage in China’s video
streaming market.
• Speed: we built the whole VDS thing in a few months!
• Team: You! Seriously!
• Technology:
• Depth
• Breadth
• Cloud
• Customizability
• Self-Learning
Life is not all rosy at startups
• High Risk, High Pressure, High Uncertainty!
• Resources are scarce, but you MUST DELIVER!
• Forming your all-star team is not that easy…
• Focus, and persistence.
What can Taiwan’s academia
do to help bridge the gap?
HMM….
A healthy cycle
[Diagram nodes: Academia, Industry, General Public. Edge labels: reputation and policy support; improved living standards; students; opportunity; well-trained graduates; grants and collaborations]
A vicious cycle
[Diagram nodes: Academia, Industry, General Public. Edge labels: unsupportive policies; stagnant wages; useless education; unemployable graduates; no grants; no students]
Where should we start?
Maybe with a few more stories.
The Goldilocks zone of innovation
[Figure axes: Business Relevance vs. Academic Relevance. Labels: plentiful resources, hierarchical organization; lack of resources, responsive organization; traditional corporations talking “innovation”; corporate research; startups struggling to survive; academic spinoffs; MSR]
Thank You!
albert@viscovery.com

 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuinethapagita
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...Universidade Federal de Sergipe - UFS
 
preservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxpreservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxnoordubaliya2003
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 

Recently uploaded (20)

basic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomybasic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomy
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
Forensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxForensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptx
 
Carbon Dioxide Capture and Storage (CSS)
Carbon Dioxide Capture and Storage (CSS)Carbon Dioxide Capture and Storage (CSS)
Carbon Dioxide Capture and Storage (CSS)
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
preservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxpreservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptx
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 

The Opportunities and Challenges of Putting the Latest Computer Vision and Deep Learning Algorithms to Work

  • 1. Putting the latest Computer Vision and Deep Learning algorithms to work The Opportunities and Challenges Albert Y. C. Chen, Ph.D. Vice President, R&D Viscovery
  • 2. Albert Y. C. Chen, Ph.D. • Experience 2017-present: Vice President of R&D at Viscovery 2016-2017: Chief Scientist at Viscovery 2015: Principal Scientist @ Nervve Technologies 2013-2014 Computer Vision Scientist @ Tandent Vision 2011-2012 @ GE Global Research • Education Ph.D. in Computer Science, SUNY-Buffalo M.S. in Computer Science, NTNU B.S. in Computer Science, NTHU • Some random things about me… SUNY Excellence in Teaching Award, 2010. Some rapid promotions, some failed startups, some patents, some papers…
  • 3. 1. W. Wu, A. Y. C. Chen, L. Zhao, and J. J. Corso. Brain tumor detection and segmentation in a CRF framework with pixel-wise affinity and superpixel-level features. International Journal of Computer Assisted Radiology and Surgery, 2015. 2. S. N. Lim, A. Y. C. Chen, and X. Yang. Parameter Inference Engine (PIE) on the Pareto Front. In Proceedings of International Conference of Machine Learning, AutoML Workshop, 2014. 3. A. Y. C. Chen, S. Whitt, C. Xu, and J. J. Corso. Hierarchical supervoxel fusion for robust pixel label propagation in videos. In Submission to ACM Multimedia, 2013. 4. A. Y. C. Chen and J. J. Corso. Temporally consistent multi-class video-object segmentation with the video graph-shifts algorithm. In Proceedings of IEEE Workshop on Applications of Computer Vision, 2011. 5. D. R. Schlegel, A. Y. C. Chen, C. Xiong, J. A. Delmerico, and J. J. Corso. Airtouch: Interacting with computer systems at a distance. In Proceedings of IEEE Workshop on Applications of Computer Vision, 2011. 6. A. Y. C. Chen and J. J. Corso. On the effects of normalization in adaptive MRF Hierarchies. In Proceedings of International Symposium CompIMAGE, 2010. 7. A. Y. C. Chen and J. J. Corso. Propagating multi-class pixel labels throughout video frames. In Proceedings of IEEE Western New York Image Processing Workshop, 2010. 8. A. Y. C. Chen and J. J. Corso. On the effects of normalization in adaptive MRF Hierarchies. Computational Modeling of Objects Represented in Images, pages 275–286, 2010. 9. Y. Tao, L. Lu, M. Dewan, A. Y. C. Chen, J. J. Corso, J. Xuan, M. Salganicoff, and A. Krishnan. Multi-level ground glass nodule detection and segmentation in CT lung images. Medical Image Computing and Computer-Assisted Intervention, 2009. 10. A. Y. C. Chen, J. J. Corso, and L. Wang. Hops: Efficient region labeling using higher order proxy neighborhoods. In Proceedings of IEEE International Conference on Pattern Recognition, 2008.
  • 4. Some work done before I caught the startup fever: AirTouch HCI interface for Content-based Image Retrieval [Diagram labels: Start; AirTouch waits in background for the initialization signal; Initialize; Freestyle Sketching Stage; Terminate; Output; CBIR query; image database; Results]
  • 5. Interactive Segmentation & Classification • Segmentation then classification: • computationally more efficient, • results in much higher classification accuracy. • Pioneered the “pixel label propagation” field. • First to utilize superpixels and supervoxels for the task. [Figure labels: Label a subset of pixels (FG, BG); Pixel label map; Traditional Spatial Propagation; Spatio-temporal Propagation; time]
  • 6. Image/Video Object Recognition and Content Understanding [Diagram labels: Low-Level; Mid-Level; High-Level; Ontology; object; Person 1; Person 2; activity: approaches, carries, gives, receives; Time; Reasoning]
  • 7. Learning and Adapting Optimal Classifier Parameters subspace B subspace A subspace C Image-level feature space priors Patch-level feature space posterior probability suggest optimal parameter configuration
  • 8. Graphical Models and Stochastic Optimization (a) The space-time volume of a video, showing the objects (A-F) and the time span over which each appears. (b) The temporal relationship graph: an edge between two vertices means that the two objects overlap in time. (c) The goal: cover all objects with the smallest number of "ground truth key frames". (d) This translates to iteratively solving the max clique problem until all vertices belong to a clique. [Figure labels: space, time; objects A-F; key 1, key 2; frame t-1, frame t; layer n, layer n+1; Temporal Shift; Shift µ]
  • 9. Medical Imaging and Geospatial Imaging • GGN (ground glass nodule) detection and segmentation in lung CT • Geospatial imaging: building detection • Brain tumor detection and segmentation in MR images
  • 10. Why Risk to Innovate? • A good business model NEVER lasts forever. • Average “shelf life” on the S&P 500: 20 years. • 100-year-old companies constantly reinvent themselves every 10-20 years. • Startups contribute 20% of the USA’s GDP.
  • 11. The Death of a Good Business Model • Foxconn 20-year revenue vs. net profit (now at 5%)
  • 12. What do 100-year-old corporations do? GE Schenectady, 1896
  • 13. History of change at GE • 1896: one of the 12 original companies on the Dow Jones Industrial Average (also the only one remaining). • 1889: lightbulbs • 1919: radios • 1927: TV • 1941: jet engine • 1960: nuclear power • 1971: room AC units • 1995: MRI
  • 14. History of change at IBM • 1960s: mainframe computer • 1980s: personal computer • 2000s: integrated solutions • 2020s: AI, Watson
  • 15. How about the leading Semiconductor companies?
  • 16. NVidia reinventing itself —2 times in 20 years
  • 17. “Bad money drives out good” in the desktop GPU market
  • 18. The rise of mobile computing, and how NVidia missed the boat!
  • 19. NVidia’s Tegra mobile processors never took off; then the market saturated…
  • 20. NVidia has not just survived. NVidia is thriving!
  • 21. Meet the new NVidia: Deep Learning, Deep Learning, and still, Deep Learning
  • 23. The king is dead, long live the king!
  • 24. Now, again, do we want to do OEM/ODM forever? Optimizing an old business model is just delaying its eventual death.
  • 25. Startups • A company, partnership, or temporary organization designed to search for a new, repeatable and scalable business model.
  • 26. Your Idea • Are you passionate about it? • Is it disruptive enough? • What is your business plan? • What is it? • Can it make money? • What is the future of the idea? • What is your competitive advantage? • How do you build up your entry barrier?
  • 27. A minimal startup team • A hacker • A hustler • A hipster
  • 30. Prototype • Hack out a prototype • Spend 2-10 weeks max. • Investors are much more likely to fund you if you have a minimal initial version of your idea. • Hackathons are a good place to start. • Iteratively improve the prototype
  • 33. Build up your entry barrier! • Market (users) • Speed • Team • Technology
  • 34. Building entry barrier with Technology!!
  • 35. Computer Vision, it can’t be that hard, right?
  • 36. Brief History [Photos: Marvin Minsky; Gerald Sussman] “In 1966, Minsky hired a first-year undergraduate student and assigned him a problem to solve over the summer: connect a television camera to a computer and get the machine to describe what it sees.” The student never worked on Computer Vision problems again.
  • 37. Brief History • 1960’s: interpretation of synthetic worlds • 1970’s: some progress on interpreting selected images • 1980’s: ANNs come and go; shift toward geometry and increased mathematical rigor • 1990’s: face recognition; statistical analysis in vogue • 2000’s: broader recognition; large annotated datasets available; video processing starts [Example images: Guzman ‘68; Ohta Kanade ‘78; Turk and Pentland ‘91]
  • 38. What’s in our arsenal? • Image filters • Feature descriptors • Classifiers
  • 45. Features: Laplacian of Gaussian (LoG; scale detection)
  • 46. Features: Orientation How to compute the rotation? Create edge orientation histogram and find peak.
  • 55. What algorithm should I use then? • How much data do we have? • What objects are we trying to detect? • For example, Google’s DNN trained with 11k images over 20 classes in 2013 doesn’t always beat DPM. [Bar chart: per-class scores (axis 0 to 0.6), DNN vs. DPM, for aero, bike, bird, boat, bottle, bus, car, cat, chair, cow, dog, horse, m-bike, person, plant, sheep, sofa, table, train, TV]
  • 56. ML algorithms and their Applications • Deep Learning • Markovian/Bayesian • Feature Matching • Other ML methods
  • 57. Meta-Learning • Different use cases call for different ML algorithms. • Meta-Learning: learning how to learn. • Requires plenty of domain-specific know-how.
  • 59. 1970s-now: Machine Vision for Industrial Inspection • Final inspection cells • Robot guidance and checking orientation of components • Packaging Inspection • Medical vial inspection • Food pack checks • Verifying engineered components • Wafer Dicing • Reading of Serial Numbers • Inspection of Saw Blades • Inspection of Ball Grid Arrays (BGAs) • Surface Inspection • Measuring of Spark Plugs • Molding Flash Detection • Inspection of Punched Sheets • 3D Plane Reconstruction with Stereo • Pose Verification of Resistors • Classification of Non-Woven Fabrics • Automated Train Examiner (ATEx) Systems • Automatic PCB inspection • Wood quality inspection • Final inspection of sub-assemblies • Engine part inspection • Label inspection on products • Checking medical devices for defects
  • 60. Industrial Inspection: turbofan jet engine blade maintenance • Some seemingly daunting machine vision tasks actually work with relatively simple image processing algorithms.
  • 64. License Plate Readers with Text Detection and Neural Networks
  • 67. Face Recognition (1990s-now) • Face Detection (Viola and Jones, 2001) • Face Verification (1:1) vs. Identification (1:N)
  • 68. Face Verification and Identification, Labeled Faces in the Wild (LFW) Recognition Accuracy: • 1 to 1: 99%+ • 1 to 100: 90% • 1 to 10,000: 50%-70% • 1 to 1M: 30% LFW dataset, common FN↑, FP↓
  • 70. Sports—NFL first down line (1995-now)
  • 71. Sports—NFL first down line minus equals
  • 72. 3D Reconstruction (As old as CV; became practical since SIFT)
  • 73. 3D Reconstruction with Feature Matching, Structure from Motion
  • 76. Solving Panorama Problem with Markov Random Fields Input:
  • 83. Solving Panorama Problem with Markov Random Fields
  • 84. Solving Panorama Problem with Markov Random Fields ICM (Iterated Conditional Modes), 1986
  • 85. Solving Panorama Problem with Markov Random Fields Belief Propagation (1980-2000)
  • 86. Solving Panorama Problem with Markov Random Fields Graph-Cuts (alpha expansion), 2001
  • 88. Solving Photo Synthesis (Compositing) Problems with Alpha Matting (2000s-now)
  • 89. Object Detection & Classification state-of-the-art • ImageNet Large Scale Visual Recognition Challenge (ILSVRC) • 1000+ classes, 1.2M images. [Chart: classification error and classification + localization error for ’11 through ’14; axis 0 to 0.5]
  • 90. Image Scene Classification • MIT Places 401 dataset. • top-5 accuracy rates >80%.
  • 93. 2005 winner, Stanley (Stanford), 3mph through desert
  • 96. 2007 winner, Boss (CMU), 13mph through the city
  • 97. Self Driving Cadillac, US congressman to airport, 2013
  • 98. Google Self Driving Car, 2015
  • 101. NVidia Self Driving Car, 2016
  • 102. How did we come this far? Race car drivers know the trick
  • 103. Focus on Free Space / Drivable Area, not Obstacles!
  • 105. Structure from X, Floored
  • 109. IMRSV
  • 111. Other Applications in Business Intelligence • Measure brand exposure. • Measure sponsorship effectiveness. • Loss prevention and retail layout optimization.
  • 114. My humble attempts at putting the latest Computer Vision algorithms to work
  • 115. Intrinsic Imaging at Tandent Vision Science • Computer Vision would be half-solved without shadows! [Figure panels: Original Image; Light; Surface]
  • 116. Tandent Lightbrush Video Tutorial for Tandent Lightbrush: https://vimeo.com/47009123
  • 118. Issues • Highly anticipated, highly acclaimed, but small crowd at $500 a license. • Adobe Photoshop monopoly and the “not invented here” syndrome. • Adobe’s arch-rival, Corel (Corel Draw, Paint Shop Pro, Ulead PhotoImpact) was DYING and asked too much from the botched deal.
  • 119. Have fun scribbling out your shadows in Photoshop! Poor Bob from Adobe wasted 9 minutes removing just 1 shadow
  • 120. Intrinsic Imaging for improving the RGB signal in autonomous driving
  • 122. Retrospect • 20 researchers burned 25 million in 8 years; investors got 50 patents in return, period. • Overestimated the total addressable market size, in a market with an existing monopoly. • Many missed opportunities. A counterexample to the lean startup model.
  • 123. Some SfM, SLAM startups
  • 124. Satellite/Aerial Imagery Analysis • 40cm resolution at 30fps for 90 sec for any location on earth. • One LEO satellite revisits any place on Earth every 3 days. • Need 24 satellites to revisit any place on Earth every 3 hours.
  • 125. Challenges for single-satellite depth estimation and 3D reconstruction • At 30fps, a LEO satellite travels 250m between two consecutive frames -> theoretically sufficient for cm-level depth estimation. • Sources of noise: • Camera distortions • Atmospheric disturbance • Ground vegetation • Sub-pixel sampling noise
  • 126. What happened? • B2B customers take too long to strike deals. • Google ate us alive in just 3 months, while we were still pitching for VC funding with our prototype.
  • 127. Visual Search at Nervve
  • 128. Retrospect • Growth pains expanding from intelligence community clients to advertisement clients. • Forming the right team of engineers and researchers and moving at the right pace. • For any Computer Vision/Machine Learning company: • Researchers that cannot program—> OUT • Engineers that don’t know math —> OUT
  • 130. Once in a lifetime opportunity in China’s video streaming market
  • 131. What do we need? Face Motion Image scene Text Audio Object Semantics
  • 132. Viscovery VDS (Video Discovery Service)
  • 135. Challenges Encountered Along the Way • From Product Recognition in Images, to Face, Logo, Object, Scene recognition in Videos. • Number of Categories • Recognition Accuracy • Recognition Speed • System Architecture • Business Model
  • 136. Viscovery’s Edge • Market: first mover’s advantage in China’s video streaming market. • Speed: we built the whole VDS thing in a few months! • Team: You! Seriously! • Technology: • Depth • Breadth • Cloud • Customizability • Self-Learning
  • 137. Life is not all rosy at startups • High Risk, High Pressure, High Uncertainty! • Resources are scarce, but you MUST DELIVER! • Forming your all-star team is not that easy… • Focus, and persistence.
  • 138. What can Taiwan’s academia do to help bridge the gap? HMM….
  • 139. A healthy cycle [Diagram: Academia, Industry, and the General Public, linked by: reputation and policy support; improved living standards; students; opportunity; well-trained graduates; grants and collaborations]
  • 141. Where should we start? Maybe with a few more stories.
  • 144. The Goldilocks zone of innovation
  • 145. The Goldilocks zone of innovation [Chart axes: Business Relevance; Academic Relevance. Regions: plentiful resources, hierarchical organization; lack of resources, responsive organization. Plotted: traditional corporations talking “innovation”; corporate research; MSR; academic spinoffs; startups struggling to survive]
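
Slide 8 describes choosing "ground truth key frames" by repeatedly solving a max clique problem on the temporal relationship graph. The following is a minimal sketch of that idea, not the author's implementation. It assumes each object is given as an inclusive (start, end) frame span, and it relies on the fact that for time intervals a maximum clique is simply the set of objects alive at the frame of maximum overlap. The function names and the example spans are hypothetical, and the greedy loop is not guaranteed to find the smallest possible set of key frames.

```python
def max_overlap_frame(spans):
    """Return (frame, names) for the frame where the most spans overlap.

    spans maps an object name to an inclusive (start, end) frame range.
    A maximum overlap is always reached at some span's start frame,
    so only those frames need to be checked.
    """
    best_frame, best_names = None, []
    for frame in sorted({start for start, _ in spans.values()}):
        alive = sorted(n for n, (s, e) in spans.items() if s <= frame <= e)
        if len(alive) > len(best_names):
            best_frame, best_names = frame, alive
    return best_frame, best_names


def greedy_key_frames(spans):
    """Pick key frames until every object is covered by one of them."""
    remaining = dict(spans)
    keys = []
    while remaining:
        frame, covered = max_overlap_frame(remaining)
        keys.append((frame, covered))
        for name in covered:
            del remaining[name]
    return keys


# Hypothetical spans, loosely following the A-F layout on the slide.
spans = {
    "A": (0, 5), "B": (1, 6), "E": (3, 7), "F": (4, 8),
    "C": (9, 14), "D": (10, 15),
}
keys = greedy_key_frames(spans)
for frame, names in keys:
    print(f"key frame {frame}: covers {names}")
```

With these made-up spans the sketch selects frame 4 for objects A, B, E, F and frame 10 for objects C, D, which mirrors the two key frames drawn on the slide.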

Editor's Notes

  1. 翟本橋: never worked a single day in my life. Example: TiVo disrupts the TV market / creates the DVR market. Example: Facebook, Twitter disrupt online social networking. Example: FourSquare creates the location-based "check in" ad market.
  2. Agriculture Health Monitoring, Humanitarian Aid, Insurance Modeling, Oil Storage Monitoring, Natural Disaster Response, Oil & Gas Infrastructure Monitoring, Financial Trading Intelligence, Mining and Logging Monitoring, Maritime Monitoring
  3. At 681 km altitude, orbital speed is 27,065 km/h = 451 km/min = 7.52 km/s = 250 m/frame at 30 fps.
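
The arithmetic in note 3 can be checked with a short sketch. It assumes a circular orbit and uses standard values for Earth's gravitational parameter and mean radius; neither value appears in the slides, and the function names are made up for illustration. The result is the distance travelled along the orbit per frame, which is the figure the note quotes.

```python
import math

# Assumed constants (not from the slides).
MU_EARTH_KM3_S2 = 398600.4418  # Earth's standard gravitational parameter
EARTH_RADIUS_KM = 6371.0       # Earth's mean radius


def circular_orbit_speed_km_s(altitude_km):
    """Speed of a circular orbit at the given altitude: v = sqrt(mu / r)."""
    return math.sqrt(MU_EARTH_KM3_S2 / (EARTH_RADIUS_KM + altitude_km))


def distance_per_frame_m(altitude_km, fps):
    """Distance travelled along the orbit between two consecutive frames."""
    return circular_orbit_speed_km_s(altitude_km) * 1000.0 / fps


speed_km_s = circular_orbit_speed_km_s(681.0)
per_frame_m = distance_per_frame_m(681.0, 30.0)
print(f"{speed_km_s:.2f} km/s, {speed_km_s * 3600:.0f} km/h, "
      f"{per_frame_m:.0f} m per frame at 30 fps")
```

Under these assumptions the sketch gives roughly 7.52 km/s and about 250 m per frame, in line with the figures in the note.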