#denatechcon
Building HD Maps with Dashcams
Kosuke Kuzuoka
AI System Group
DeNA Co., Ltd.
#denatechcon
Agenda
• Who I am
• Our Goal
• Intro to DL and SfM
• 3D Point Reconstruction
• Recognizing Objects
• Putting It All Together
#denatechcon
Who I am
• Profile
• Kosuke Kuzuoka
• 22 years old
• Experience
• June 2018 - Present
AI Research Engineer at DeNA Co., Ltd.
• March 2017 - June 2018
R&D manager at CONCORE’S, inc.
• Interests
• Self Driving Cars
• Computer Vision
Facebook: https://www.facebook.com/kousuke.kuzuoka.9
LinkedIn: https://www.linkedin.com/in/kousuke-kuzuoka-4101ba160/
#denatechcon
What I have done before
Detecting objects from construction
plans using deep learning algorithms
Patent pending algorithm that I
developed for detecting pillars across
multiple tiled images
#denatechcon
Our Goal
● To create high definition maps at
a lower price
● 3D point reconstruction and
object detection in dashcam
images
● No use of expensive equipment,
such as LiDAR
https://medium.com/@surmenok/hd-maps-for-self-driving-cars-c41bc01e0d40
#denatechcon
Isn’t it like Google Maps?
● Google Maps is a map designed for humans
● It has useful information for
humans
● An HD map is a map designed for machines
● It has useful information for cars,
such as where traffic signs exist
#denatechcon
Is it for self-driving cars?
● It’s extensively used in self-driving cars,
such as for localization and path planning
● Therefore, the location accuracy of HD
maps needs to be within a few centimeters
● A self-driving car needs to know which
direction the lane is leading, where the
traffic signs are, etc.
https://www.youtube.com/watch?time_continue=207&v=EUq5DlPQdhg
#denatechcon
Introduction to Deep Learning
● The ideas behind deep learning date back to the late 1950s, when Frank Rosenblatt invented the Perceptron.
● The Perceptron was able to solve linearly separable problems.
● Later, it turned out that a simple (single-layer) Perceptron cannot solve problems that are not linearly
separable, such as XOR.
https://becominghuman.ai/deep-learning-made-easy-with-deep-cognition-403fbe445351
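The difference between linearly separable and non-separable problems can be illustrated with a minimal sketch. The code below is not from the talk: it assumes the classic perceptron update rule, and the function names, learning rate and epoch count are chosen only for illustration. It trains on AND (separable) and XOR (not separable).

```python
# Minimal perceptron sketch; names and hyperparameters are illustrative assumptions.

def predict(weights, x1, x2):
    """Step activation on a weighted sum of two inputs plus a bias."""
    w1, w2, b = weights
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Apply the perceptron update rule for a fixed number of epochs."""
    w1, w2, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            err = target - predict((w1, w2, b), x1, x2)
            w1 += lr * err * x1
            w2 += lr * err * x2
            b += lr * err
    return w1, w2, b

def accuracy(weights, samples):
    """Fraction of samples classified correctly."""
    hits = sum(predict(weights, x1, x2) == t for (x1, x2), t in samples)
    return hits / len(samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_acc = accuracy(train_perceptron(AND), AND)  # converges: a separating line exists
xor_acc = accuracy(train_perceptron(XOR), XOR)  # cannot be perfect with a single line
```

On AND the weights settle after a few epochs. On XOR no setting of the three weights classifies all four points, which is the limitation the slide refers to.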
#denatechcon
Why is deep learning popular nowadays?
● Large scale datasets such as ImageNet have been made public for research purposes
● Powerful computational resources such as GPUs are more accessible than ever before
https://en.wikipedia.org/wiki/Nvidia
http://www.image-net.org/
#denatechcon
Okay, but what can you do with DL?
● Using deep learning, we can
solve object detection and
instance segmentation
problems
● Object detection detects
multiple objects in the image,
while instance segmentation
also outlines the boundary of each object
● Using deep learning, we can
solve image classification and
image localization problems
● Image classification predicts
what is in the image, while
image localization predicts
what is in the image and where it is
https://medium.com/comet-app/review-of-deep-learning-algorithms-for-object-detection-c1f3d437b852
#denatechcon
Okay, let’s sum that up
• Deep learning is not new
• Data is important for deep learning
• Substantial computational resources are necessary
• You can do so many things with deep learning
#denatechcon
Introduction to SfM
SfM stands for Structure from
Motion. It is an algorithm that
reconstructs 3D points (the
structure) from images taken
from different angles or positions
(the motion). Large-scale
applications include, for example,
reconstructing all of Rome using
only images found on the web.
https://grail.cs.washington.edu/rome/rome_paper.pdf
#denatechcon
How does SfM work?
https://www.mathworks.com/help/vision/ug/structure-from-motion.html
● Extracts features from images, e.g.
corners or edges
● Matches the features in images taken
from different positions
● Calculates the corresponding points
in 3D coordinates using triangulation
● Calculates camera position and
optimizes reconstructed 3D points
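The triangulation step can be sketched as follows. This is a simplified illustration, not the pipeline used in the talk: it assumes the two camera centers and the viewing rays toward a matched feature are already known, and it uses the midpoint method (the point halfway between the closest points of the two rays). Real SfM pipelines typically use a linear least-squares triangulation and then refine cameras and points jointly. All names and values are hypothetical.

```python
# Midpoint triangulation of one matched feature seen from two camera positions.
# Camera centers, ray directions and the test landmark are illustrative assumptions.

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def add(a, b):
    return [a[i] + b[i] for i in range(3)]

def scale(a, s):
    return [a[i] * s for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def triangulate_midpoint(c1, d1, c2, d2):
    """Return the midpoint between the closest points of rays c1 + s*d1 and c2 + t*d2."""
    w0 = sub(c1, c2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = add(c1, scale(d1, s))
    p2 = add(c2, scale(d2, t))
    return scale(add(p1, p2), 0.5)

# A landmark observed from two positions one meter apart along the x axis.
landmark = [1.0, 2.0, 10.0]
cam1, cam2 = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
ray1, ray2 = sub(landmark, cam1), sub(landmark, cam2)

point = triangulate_midpoint(cam1, ray1, cam2, ray2)
```

With noise-free rays the two rays intersect and the midpoint coincides with the landmark. With noisy matches the rays no longer intersect, which is why the reconstructed points are optimized afterwards.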
#denatechcon
What can you do with SfM?
https://grail.cs.washington.edu/rome/rome_paper.pdf
SfM was used to build a 3D representation of Rome within a day from images found on the web. The
reconstruction used 150k images, and the processing time was around 21 hours using 496 CPU cores.
#denatechcon
Let’s sum that up
• SfM can reconstruct 3D shapes from 2D images
• A 3D representation of Rome can be built in a day
using images from the web
#denatechcon
So we have tools. What now?
● Dashcam images are used for reconstructing 3D points by SfM
● The same images are used for detecting objects in 2D space
● Both results are integrated to get 3D representations of each object
#denatechcon
3D Point Reconstruction
● Images are taken by driving in the
highlighted region in Minatomirai
● Dashcam images are used for SfM
and object detection
#denatechcon
Overall shape looks good
● 3D modeling of a relatively small
region in Minatomirai
● Reconstructed shape matches the
highlighted region in the map
#denatechcon
Slightly larger region, still good
● Red arrows indicate the direction
the car was driving
● The reconstructed shape matches
the highlighted region in the map
#denatechcon
Hooray, view from top is good
● SfM was applied in a larger region
in the Minatomirai area
● Overall shape still matches the map
#denatechcon
What about the closer view?
Details such as road markings and speed
limit signs are visible, though some of the
reconstructed information is unnecessary
Lanes are reconstructed well on the left
side, but the center lane markings on
the right are missing. This is caused by
the divider
#denatechcon
Some findings with SfM are:
• Reconstructed 3D points contain small details
• A GPU can reduce the processing time significantly
• The more images, the better the result
#denatechcon
Recognizing Objects
● We chose Faster R-CNN for detecting
traffic signs
● Faster R-CNN was a state-of-the-art
detector in 2016
● Faster R-CNN is more accurate than
real-time detectors, but it is also
slower
https://arxiv.org/abs/1506.01497
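Detectors such as Faster R-CNN usually produce several overlapping boxes for the same object and remove the duplicates with non-maximum suppression (NMS). The sketch below shows that post-processing step only, not the network itself. The box format `(x1, y1, x2, y2)`, the threshold and the sample detections are assumptions made for illustration.

```python
# Intersection over union (IoU) and greedy non-maximum suppression.
# Box format (x1, y1, x2, y2) and the sample detections are illustrative assumptions.

def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(detections, iou_threshold=0.5):
    """Keep the highest-scoring box of every group of heavily overlapping boxes."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept

# Two overlapping boxes on one traffic sign and one box on another sign.
detections = [
    ((100, 100, 150, 150), 0.9),
    ((105, 102, 152, 151), 0.8),
    ((300, 80, 340, 120), 0.7),
]
kept = nms(detections)
```

Here the second box overlaps the first with an IoU above the threshold and is dropped, so two detections remain.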
#denatechcon
Objects are detected correctly
● Most traffic signs are detected correctly, though
one small traffic sign is missed by the detector
● The network predicts the category for each box,
and there are more than 100 categories to choose
from
#denatechcon
Another example for traffic signs
#denatechcon
What now for lane detection?
https://arxiv.org/abs/1802.05591
● We chose LaneNet, published in 2018, as the lane detector
● LaneNet transforms the original image into a bird’s eye image using learned parameters
● It can detect multiple lane instances in real time with high accuracy
#denatechcon
Deep learning can detect lanes!
● Different colors indicate different instances
● You can see that the lanes are detected correctly
● It can detect curved lanes as well, though
none appear in this image
#denatechcon
Another example for lane detection
#denatechcon
What about road markings?
● Bird’s eye transformation on the original image
● Faster R-CNN on the bird’s eye image
● Inverse transformation on the bird’s eye image
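The bird's eye transformation and its inverse can be sketched with a 3x3 homography applied to pixel coordinates. The matrix `H` below is hand-picked and purely hypothetical, since the talk does not give the actual parameters. The example maps the corners of a road marking into the bird's eye view and back again with the inverse matrix.

```python
# Applying a homography and its inverse to pixel coordinates.
# The matrix H and the corner coordinates are illustrative assumptions.

def apply_homography(H, pt):
    """Map a pixel (x, y) through the 3x3 matrix H using homogeneous coordinates."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def invert_3x3(M):
    """Invert a 3x3 matrix with the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

H = [
    [1.0, 0.2, 0.0],
    [0.0, 1.5, 0.0],
    [0.0, 0.002, 1.0],
]
H_inv = invert_3x3(H)

# Corners of a road marking in the original image (pixels).
corners = [(300.0, 400.0), (340.0, 400.0), (340.0, 450.0), (300.0, 450.0)]

# Forward: original image -> bird's eye view, where the detector would run.
bev_corners = [apply_homography(H, p) for p in corners]

# Inverse: detections in the bird's eye view -> original image.
restored = [apply_homography(H_inv, p) for p in bev_corners]
```

In practice the whole image is warped before detection and only the detected box coordinates are mapped back, which is what the last step stands in for.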
#denatechcon
Deep learning works for road markings!
● Road markings are detected correctly.
● It distinguishes the lane from the stop sign
● The detection fits objects, though not perfectly
#denatechcon
Another example for road markings
#denatechcon
The result is impressive
#denatechcon
Objects are detected precisely
#denatechcon
Let’s sum that up
• Traffic sign recognition with more than 100
categories can be solved with deep learning
• Deep learning works well on complicated tasks
such as lane and road marking detection
• The more data, the better the results
#denatechcon
Putting It All Together
● Green points indicate the region used for 3D
reconstruction
● The detection has to be done in frames where
the objects are highlighted in green
#denatechcon
Results are now integrated
We can get a 3D representation of
detected objects by integrating both
results. The final result will look like
the image above.
#denatechcon
Now, objects are represented in 3D
● Detected traffic signs and road markings are
converted to 3D
● Each object has a 3D representation after
integrating both SfM and object detection results
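The talk does not spell out how the two results are combined, so the following is only one plausible sketch. It assumes that the reconstructed 3D points are already expressed in the coordinate frame of the camera that took the image, projects them with a pinhole model, and assigns each point the label of the detection box it falls into. The intrinsics `fx`, `fy`, `cx`, `cy`, the points and the detection are hypothetical.

```python
# Attaching 2D detection labels to 3D points by pinhole projection.
# Intrinsics, points and detections are illustrative assumptions.

def project(point, fx, fy, cx, cy):
    """Project a camera-frame 3D point to pixel coordinates, or None if behind the camera."""
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)

def label_points(points, detections, fx, fy, cx, cy):
    """Return (point, label) pairs for points that project inside a detection box."""
    labeled = []
    for p in points:
        uv = project(p, fx, fy, cx, cy)
        if uv is None:
            continue
        for (x1, y1, x2, y2), label in detections:
            if x1 <= uv[0] <= x2 and y1 <= uv[1] <= y2:
                labeled.append((p, label))
                break
    return labeled

points = [
    (1.0, -0.5, 10.0),  # projects to (740, 310)
    (-2.0, 1.0, 8.0),   # projects to (390, 485)
    (0.0, 0.0, -5.0),   # behind the camera
]
detections = [((700, 280, 780, 340), "speed_limit")]

labeled = label_points(points, detections, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
```

Only the first point lands inside the detected box, so it is the only one that receives the `speed_limit` label. Repeating this over all frames would give each detected object a set of labeled 3D points.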
#denatechcon
We are done!
● Reconstructed 3D view seen from above
● You can see the detected lanes and road
markings now have a 3D representation
#denatechcon
Using this technique, we could do:
• Automating the map creation process
• Creating HD maps for other services
• Detecting changes automatically
#denatechcon
Thanks!
#denatechcon
#denatechcon

More Related Content

What's hot

サイバーエージェントにおけるMLOpsに関する取り組み at PyDataTokyo 23
サイバーエージェントにおけるMLOpsに関する取り組み at PyDataTokyo 23サイバーエージェントにおけるMLOpsに関する取り組み at PyDataTokyo 23
サイバーエージェントにおけるMLOpsに関する取り組み at PyDataTokyo 23Masashi Shibata
 
Unityではじめるオープンワールド制作 エンジニア編
Unityではじめるオープンワールド制作 エンジニア編Unityではじめるオープンワールド制作 エンジニア編
Unityではじめるオープンワールド制作 エンジニア編Unity Technologies Japan K.K.
 
データに内在する構造をみるための埋め込み手法
データに内在する構造をみるための埋め込み手法データに内在する構造をみるための埋め込み手法
データに内在する構造をみるための埋め込み手法Tatsuya Shirakawa
 
ICCV 2019 論文紹介 (26 papers)
ICCV 2019 論文紹介 (26 papers)ICCV 2019 論文紹介 (26 papers)
ICCV 2019 論文紹介 (26 papers)Hideki Okada
 
文献紹介:Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows
文献紹介:Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows文献紹介:Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows
文献紹介:Swin Transformer: Hierarchical Vision Transformer Using Shifted WindowsToru Tamaki
 
[Ridge-i 論文よみかい] Wasserstein auto encoder
[Ridge-i 論文よみかい] Wasserstein auto encoder[Ridge-i 論文よみかい] Wasserstein auto encoder
[Ridge-i 論文よみかい] Wasserstein auto encoderMasanari Kimura
 
深層学習によるHuman Pose Estimationの基礎
深層学習によるHuman Pose Estimationの基礎深層学習によるHuman Pose Estimationの基礎
深層学習によるHuman Pose Estimationの基礎Takumi Ohkuma
 
[DL輪読会]EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning
[DL輪読会]EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning[DL輪読会]EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning
[DL輪読会]EdgeConnect: Generative Image Inpainting with Adversarial Edge LearningDeep Learning JP
 
Skip Connection まとめ(Neural Network)
Skip Connection まとめ(Neural Network)Skip Connection まとめ(Neural Network)
Skip Connection まとめ(Neural Network)Yamato OKAMOTO
 
SSII2018TS: 大規模深層学習
SSII2018TS: 大規模深層学習SSII2018TS: 大規模深層学習
SSII2018TS: 大規模深層学習SSII
 
コンピューテーショナルフォトグラフィ
コンピューテーショナルフォトグラフィコンピューテーショナルフォトグラフィ
コンピューテーショナルフォトグラフィNorishige Fukushima
 
CMA-ESサンプラーによるハイパーパラメータ最適化 at Optuna Meetup #1
CMA-ESサンプラーによるハイパーパラメータ最適化 at Optuna Meetup #1CMA-ESサンプラーによるハイパーパラメータ最適化 at Optuna Meetup #1
CMA-ESサンプラーによるハイパーパラメータ最適化 at Optuna Meetup #1Masashi Shibata
 
Attention-Guided GANについて
Attention-Guided GANについてAttention-Guided GANについて
Attention-Guided GANについてyohei okawa
 
ガイデットフィルタとその周辺
ガイデットフィルタとその周辺ガイデットフィルタとその周辺
ガイデットフィルタとその周辺Norishige Fukushima
 
ウェーブレット変換の基礎と応用事例:連続ウェーブレット変換を中心に
ウェーブレット変換の基礎と応用事例:連続ウェーブレット変換を中心にウェーブレット変換の基礎と応用事例:連続ウェーブレット変換を中心に
ウェーブレット変換の基礎と応用事例:連続ウェーブレット変換を中心にRyosuke Tachibana
 
論文紹介 Semi-supervised Learning with Deep Generative Models
論文紹介 Semi-supervised Learning with Deep Generative Models論文紹介 Semi-supervised Learning with Deep Generative Models
論文紹介 Semi-supervised Learning with Deep Generative ModelsSeiya Tokui
 
機械学習を民主化する取り組み
機械学習を民主化する取り組み機械学習を民主化する取り組み
機械学習を民主化する取り組みYoshitaka Ushiku
 
Colabをshellから使う
Colabをshellから使うColabをshellから使う
Colabをshellから使うKiyoshi SATOH
 
最適輸送入門
最適輸送入門最適輸送入門
最適輸送入門joisino
 
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptxARISE analytics
 

What's hot (20)

サイバーエージェントにおけるMLOpsに関する取り組み at PyDataTokyo 23
サイバーエージェントにおけるMLOpsに関する取り組み at PyDataTokyo 23サイバーエージェントにおけるMLOpsに関する取り組み at PyDataTokyo 23
サイバーエージェントにおけるMLOpsに関する取り組み at PyDataTokyo 23
 
Unityではじめるオープンワールド制作 エンジニア編
Unityではじめるオープンワールド制作 エンジニア編Unityではじめるオープンワールド制作 エンジニア編
Unityではじめるオープンワールド制作 エンジニア編
 
データに内在する構造をみるための埋め込み手法
データに内在する構造をみるための埋め込み手法データに内在する構造をみるための埋め込み手法
データに内在する構造をみるための埋め込み手法
 
ICCV 2019 論文紹介 (26 papers)
ICCV 2019 論文紹介 (26 papers)ICCV 2019 論文紹介 (26 papers)
ICCV 2019 論文紹介 (26 papers)
 
文献紹介:Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows
文献紹介:Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows文献紹介:Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows
文献紹介:Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows
 
[Ridge-i 論文よみかい] Wasserstein auto encoder
[Ridge-i 論文よみかい] Wasserstein auto encoder[Ridge-i 論文よみかい] Wasserstein auto encoder
[Ridge-i 論文よみかい] Wasserstein auto encoder
 
深層学習によるHuman Pose Estimationの基礎
深層学習によるHuman Pose Estimationの基礎深層学習によるHuman Pose Estimationの基礎
深層学習によるHuman Pose Estimationの基礎
 
[DL輪読会]EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning
[DL輪読会]EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning[DL輪読会]EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning
[DL輪読会]EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning
 
Skip Connection まとめ(Neural Network)
Skip Connection まとめ(Neural Network)Skip Connection まとめ(Neural Network)
Skip Connection まとめ(Neural Network)
 
SSII2018TS: 大規模深層学習
SSII2018TS: 大規模深層学習SSII2018TS: 大規模深層学習
SSII2018TS: 大規模深層学習
 
コンピューテーショナルフォトグラフィ
コンピューテーショナルフォトグラフィコンピューテーショナルフォトグラフィ
コンピューテーショナルフォトグラフィ
 
CMA-ESサンプラーによるハイパーパラメータ最適化 at Optuna Meetup #1
CMA-ESサンプラーによるハイパーパラメータ最適化 at Optuna Meetup #1CMA-ESサンプラーによるハイパーパラメータ最適化 at Optuna Meetup #1
CMA-ESサンプラーによるハイパーパラメータ最適化 at Optuna Meetup #1
 
Attention-Guided GANについて
Attention-Guided GANについてAttention-Guided GANについて
Attention-Guided GANについて
 
ガイデットフィルタとその周辺
ガイデットフィルタとその周辺ガイデットフィルタとその周辺
ガイデットフィルタとその周辺
 
ウェーブレット変換の基礎と応用事例:連続ウェーブレット変換を中心に
ウェーブレット変換の基礎と応用事例:連続ウェーブレット変換を中心にウェーブレット変換の基礎と応用事例:連続ウェーブレット変換を中心に
ウェーブレット変換の基礎と応用事例:連続ウェーブレット変換を中心に
 
論文紹介 Semi-supervised Learning with Deep Generative Models
論文紹介 Semi-supervised Learning with Deep Generative Models論文紹介 Semi-supervised Learning with Deep Generative Models
論文紹介 Semi-supervised Learning with Deep Generative Models
 
機械学習を民主化する取り組み
機械学習を民主化する取り組み機械学習を民主化する取り組み
機械学習を民主化する取り組み
 
Colabをshellから使う
Colabをshellから使うColabをshellから使う
Colabをshellから使う
 
最適輸送入門
最適輸送入門最適輸送入門
最適輸送入門
 
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
 

Similar to Building HD maps with dashcams

Synthetic Data and Graphics Techniques in Robotics
Synthetic Data and Graphics Techniques in RoboticsSynthetic Data and Graphics Techniques in Robotics
Synthetic Data and Graphics Techniques in RoboticsPrabindh Sundareson
 
Introduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingIntroduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingPreferred Networks
 
2 D3 D Concersion Swaggmedia
2 D3 D Concersion   Swaggmedia2 D3 D Concersion   Swaggmedia
2 D3 D Concersion SwaggmediaCraig Nobles
 
What is point cloud annotation?
What is point cloud annotation?What is point cloud annotation?
What is point cloud annotation?Annotation Support
 
3D Laser Scanning for Oil & Gas Facilities
3D Laser Scanning for Oil & Gas Facilities3D Laser Scanning for Oil & Gas Facilities
3D Laser Scanning for Oil & Gas FacilitiesYasser Eldegwy
 
Can We Make Maps from Videos? ~From AI Algorithm to Engineering for Continuou...
Can We Make Maps from Videos? ~From AI Algorithm to Engineering for Continuou...Can We Make Maps from Videos? ~From AI Algorithm to Engineering for Continuou...
Can We Make Maps from Videos? ~From AI Algorithm to Engineering for Continuou...DeNA
 
From 2D Map to Mobile 3D Mirror World
From 2D Map to Mobile 3D Mirror WorldFrom 2D Map to Mobile 3D Mirror World
From 2D Map to Mobile 3D Mirror WorldYu You
 
Mirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMeetupDataScienceRoma
 
Deep Learning Projects - Anomaly Detection Using Deep Learning
Deep Learning Projects - Anomaly Detection Using Deep LearningDeep Learning Projects - Anomaly Detection Using Deep Learning
Deep Learning Projects - Anomaly Detection Using Deep LearningDezyreAcademy
 
Driving Assistant Solutions with Android
Driving Assistant Solutions with AndroidDriving Assistant Solutions with Android
Driving Assistant Solutions with AndroidGiorgio Natili
 
“Accident Reconstruction” by Aleksis Liekna from Scope Technologies at Auto f...
“Accident Reconstruction” by Aleksis Liekna from Scope Technologies at Auto f...“Accident Reconstruction” by Aleksis Liekna from Scope Technologies at Auto f...
“Accident Reconstruction” by Aleksis Liekna from Scope Technologies at Auto f...DevClub_lv
 
CNTK Object Detection
CNTK Object DetectionCNTK Object Detection
CNTK Object DetectionAndy Huang
 
Enhanced real time semantic segmentation
Enhanced real time semantic segmentationEnhanced real time semantic segmentation
Enhanced real time semantic segmentationAkankshaRawat42
 
Mi 291 chapter 3 (reverse engineering)(1)
Mi 291 chapter 3 (reverse engineering)(1)Mi 291 chapter 3 (reverse engineering)(1)
Mi 291 chapter 3 (reverse engineering)(1)varun teja G.V.V
 
detailed experience
detailed experiencedetailed experience
detailed experienceBryan Yan
 
Desktop Softwares for Unmanned Aerial Systems(UAS))
Desktop Softwares for Unmanned Aerial Systems(UAS))Desktop Softwares for Unmanned Aerial Systems(UAS))
Desktop Softwares for Unmanned Aerial Systems(UAS))Kamal Shahi
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsPetteriTeikariPhD
 

Similar to Building HD maps with dashcams (20)

Synthetic Data and Graphics Techniques in Robotics
Synthetic Data and Graphics Techniques in RoboticsSynthetic Data and Graphics Techniques in Robotics
Synthetic Data and Graphics Techniques in Robotics
 
Introduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingIntroduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable Rendering
 
2 D3 D Concersion Swaggmedia
2 D3 D Concersion   Swaggmedia2 D3 D Concersion   Swaggmedia
2 D3 D Concersion Swaggmedia
 
What is point cloud annotation?
What is point cloud annotation?What is point cloud annotation?
What is point cloud annotation?
 
3D Laser Scanning for Oil & Gas Facilities
3D Laser Scanning for Oil & Gas Facilities3D Laser Scanning for Oil & Gas Facilities
3D Laser Scanning for Oil & Gas Facilities
 
Photomodeler
PhotomodelerPhotomodeler
Photomodeler
 
Can We Make Maps from Videos? ~From AI Algorithm to Engineering for Continuou...
Can We Make Maps from Videos? ~From AI Algorithm to Engineering for Continuou...Can We Make Maps from Videos? ~From AI Algorithm to Engineering for Continuou...
Can We Make Maps from Videos? ~From AI Algorithm to Engineering for Continuou...
 
From 2D Map to Mobile 3D Mirror World
From 2D Map to Mobile 3D Mirror WorldFrom 2D Map to Mobile 3D Mirror World
From 2D Map to Mobile 3D Mirror World
 
Mirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image Processing
 
Deep Learning Projects - Anomaly Detection Using Deep Learning
Deep Learning Projects - Anomaly Detection Using Deep LearningDeep Learning Projects - Anomaly Detection Using Deep Learning
Deep Learning Projects - Anomaly Detection Using Deep Learning
 
Driving Assistant Solutions with Android
Driving Assistant Solutions with AndroidDriving Assistant Solutions with Android
Driving Assistant Solutions with Android
 
“Accident Reconstruction” by Aleksis Liekna from Scope Technologies at Auto f...
“Accident Reconstruction” by Aleksis Liekna from Scope Technologies at Auto f...“Accident Reconstruction” by Aleksis Liekna from Scope Technologies at Auto f...
“Accident Reconstruction” by Aleksis Liekna from Scope Technologies at Auto f...
 
CNTK Object Detection
CNTK Object DetectionCNTK Object Detection
CNTK Object Detection
 
UI and UX for Mobile Developers
UI and UX for Mobile DevelopersUI and UX for Mobile Developers
UI and UX for Mobile Developers
 
Enhanced real time semantic segmentation
Enhanced real time semantic segmentationEnhanced real time semantic segmentation
Enhanced real time semantic segmentation
 
Mi 291 chapter 3 (reverse engineering)(1)
Mi 291 chapter 3 (reverse engineering)(1)Mi 291 chapter 3 (reverse engineering)(1)
Mi 291 chapter 3 (reverse engineering)(1)
 
Career portfolio
Career portfolioCareer portfolio
Career portfolio
 
detailed experience
detailed experiencedetailed experience
detailed experience
 
Desktop Softwares for Unmanned Aerial Systems(UAS))
Desktop Softwares for Unmanned Aerial Systems(UAS))Desktop Softwares for Unmanned Aerial Systems(UAS))
Desktop Softwares for Unmanned Aerial Systems(UAS))
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problems
 

Recently uploaded

Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitterShivangiSharma879191
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction managementMariconPadriquez1
 

Recently uploaded (20)

Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction management
 

Building HD maps with dashcams

  • 1. #denatechcon #denatechcon Building HD Maps with Dashcams Kosuke Kuzuoka AI System Group DeNA Co., Ltd.
  • 2. #denatechcon Agenda • Who I am • Our Goal • Intro to DL and SfM • 3D Point Reconstruction • Recognizing Objects • Putting It All Together
  • 3. #denatechcon Who I am • Profile • Kosuke Kuzuoka • 22 years old • Experience • June 2018 - Present AI Research Engineer at DeNA Co., Ltd. • March 2017 - June 2018 R&D manager at CONCORE’S, inc. • Interests • Self Driving Cars • Computer Vision Facebook: https://www.facebook.com/kousuke.kuzuoka.9 LinkedIn: https://www.linkedin.com/in/kousuke-kuzuoka-4101ba160/
  • 4. #denatechcon What I have done before Detecting objects from construction plans using deep learning algorithms Patent pending algorithm that I developed for detecting pillars across multiple tiled images
  • 5. #denatechcon Our Goal ● To create high definition maps at a lower price ● 3D point reconstruction and object detection in dashcam images ● No use of expensive equipment, such as LiDAR https://medium.com/@surmenok/hd-maps-for-self-driving-cars-c41bc01e0d40
  • 6. Isn’t it like Google Maps? ● Google Maps: a map designed for humans, with information useful to humans ● HD map: a map designed for machines, with information useful to cars, such as where traffic signs are
  • 7. Is it for self-driving cars? ● HD maps are used extensively in self-driving cars, for example for localization and path planning ● Therefore, the location accuracy of HD maps needs to be within a few centimeters ● A self-driving car needs to know which direction the lane is leading, where the traffic signs are, etc. https://www.youtube.com/watch?time_continue=207&v=EUq5DlPQdhg
  • 8. Introduction to Deep Learning ● The idea behind deep learning dates back to the late 1950s, when it was invented by Frank Rosenblatt. ● It was originally called the Perceptron, and it could solve linearly separable problems. ● Later, it turned out that a simple Perceptron could not solve problems that are not linearly separable. https://becominghuman.ai/deep-learning-made-easy-with-deep-cognition-403fbe445351
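The linear-separability point on slide 8 can be illustrated with a minimal perceptron sketch in Python with NumPy. The function names, learning rate, and epoch count are assumptions chosen for illustration and do not come from the talk. The sketch trains on AND (linearly separable) and on XOR (not linearly separable).

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Perceptron learning rule; returns weights with the bias as the last entry."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a constant 1 for the bias
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, target in zip(Xb, y):
            pred = 1 if xi @ w > 0 else 0
            w += lr * (target - pred) * xi  # weights change only on a mistake
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1])  # linearly separable
y_xor = np.array([0, 1, 1, 0])  # not linearly separable

and_acc = (predict(train_perceptron(X, y_and), X) == y_and).mean()
xor_acc = (predict(train_perceptron(X, y_xor), X) == y_xor).mean()
```

With these settings `and_acc` reaches 1.0, while `xor_acc` stays below 1.0, because no single straight line separates the two XOR classes.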
  • 9. Why is deep learning popular nowadays? ● Large-scale datasets such as ImageNet have been made public for research purposes ● Powerful computational resources such as GPUs are more accessible than ever before https://en.wikipedia.org/wiki/Nvidia http://www.image-net.org/
  • 10. Okay, but what can you do with DL? ● Using deep learning, we can solve object detection and instance segmentation problems ● Object detection finds multiple objects in the image, while instance segmentation outlines the boundary of each object ● Using deep learning, we can solve image classification and image localization problems ● Image classification identifies what is in the image, while image localization identifies both what is in the image and where it is https://medium.com/comet-app/review-of-deep-learning-algorithms-for-object-detection-c1f3d437b852
  • 11. Okay, let’s sum that up • Deep learning is not new • Data is important for deep learning • Substantial computational resources are necessary • You can do many things with deep learning
  • 12. Introduction to SfM: SfM stands for Structure from Motion. It is an algorithm that reconstructs 3D points (the structure) from images taken from different angles or positions (the motion). Large-scale applications include, for example, reconstructing all of Rome using only images found on the web. https://grail.cs.washington.edu/rome/rome_paper.pdf
  • 13. How does SfM work? ● Extracts features from images, e.g. corners or edges ● Matches the features across images taken from different positions ● Calculates the corresponding points in 3D coordinates using triangulation ● Calculates the camera positions and optimizes the reconstructed 3D points https://www.mathworks.com/help/vision/ug/structure-from-motion.html
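The triangulation step on slide 13 can be sketched as follows. This is a minimal linear (DLT) triangulation of one matched feature seen in two views, not the pipeline used in the talk. The intrinsics `K`, the camera poses, and the test point are hypothetical values chosen for illustration.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: (u, v) pixel coordinates of the matched feature in each view.
    """
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]            # null-space vector = homogeneous 3D point
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point into pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical intrinsics and two camera poses one unit apart along x.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, -0.2, 6.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

With noise-free matches, `X_est` recovers `X_true` up to floating-point error. Real feature matches are noisy, which is why the last step on the slide optimizes the camera positions and 3D points afterwards.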
  • 14. What can you do with SfM? ● The project linked below built a 3D representation of Rome within a day from images found on the web. ● It used 150k images, and processing took around 21 hours on 496 CPU cores. https://grail.cs.washington.edu/rome/rome_paper.pdf
  • 15. Let’s sum that up • SfM can reconstruct 3D shapes from 2D images • A 3D representation of Rome can be built in a day using images from the web
  • 16. So we have tools. What now? ● Dashcam images are used for reconstructing 3D points by SfM ● The same images are used for detecting objects in 2D space ● Both results are integrated to get 3D representations of each object
  • 17. 3D Point Reconstruction ● Images were taken while driving through the highlighted region in Minatomirai ● The dashcam images are used for both SfM and object detection
  • 18. Overall shape looks good ● 3D modeling of a relatively small region in Minatomirai ● The reconstructed shape matches the highlighted region in the map
  • 19. Slightly larger region, still good ● Red arrows indicate the direction the car was driving ● The reconstructed shape matches the highlighted region in the map
  • 20. Hooray, view from top is good ● SfM was applied to a larger region in the Minatomirai area ● The overall shape still matches the map
  • 21. What about the closer view? ● The details of road markings and speed limit signs can be found, though some of the information is unnecessary. ● Lanes are reconstructed well on the left side, but the center lane markings on the right are missing. This is caused by the divider.
  • 22. Some findings with SfM are: • Reconstructed 3D points contain small details • A GPU can reduce the processing time significantly • The more images, the better the result
  • 23. Recognizing Objects ● We chose Faster R-CNN for detecting traffic signs ● Faster R-CNN was a state-of-the-art detector in 2016 ● Faster R-CNN is more accurate than other real-time detectors, but it is slower https://arxiv.org/abs/1506.01497
  • 24. Objects are detected correctly ● Most traffic signs are detected correctly, though one small traffic sign is missed by the detector ● The network predicts a category for each box, and there are more than 100 categories to choose from
  • 26. What now for lane detection? ● We chose LaneNet, published in 2018, as the lane detector ● LaneNet transforms the original image into a bird’s eye image with learned parameters ● It can detect multiple lane instances at real-time speed and with high accuracy https://arxiv.org/abs/1802.05591
  • 27. Deep learning can detect lanes! ● Different colors indicate different lane instances ● You can see that the lanes are detected correctly ● It can detect curved lanes as well, though none appear in this image
  • 29. What about road markings? ● Bird’s eye transformation on original image ● Inverse transformation on bird’s eye image ● Faster R-CNN on bird’s eye image
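The bird’s eye transformation and its inverse on slide 29 can be sketched as a planar homography. This is an assumption for illustration: the talk does not say how the transformation is computed (LaneNet, for instance, learns its parameters), and the corner coordinates and the detection box below are made up.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 homography mapping src -> dst from four or more point pairs (DLT)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def warp_points(H, pts):
    """Apply a homography to an (N, 2) array of points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]

# Hypothetical road trapezoid in the dashcam image and its bird's-eye rectangle.
src = np.array([[200, 300], [440, 300], [640, 480], [0, 480]], dtype=float)
dst = np.array([[0, 0], [400, 0], [400, 600], [0, 600]], dtype=float)

H = homography_from_points(src, dst)   # image -> bird's eye
H_inv = np.linalg.inv(H)               # bird's eye -> image

# Corners of a hypothetical road-marking detection made in the bird's-eye view.
box_bev = np.array([[150, 400], [250, 400], [250, 500], [150, 500]], dtype=float)
box_img = warp_points(H_inv, box_bev)  # the same corners in the original image
```

Detecting in the bird’s eye view and mapping the box corners back with `H_inv` gives a quadrilateral in the original image, which is generally not an axis-aligned rectangle.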
  • 30. Deep learning works for road markings! ● Road markings are detected correctly ● It distinguishes the lane from the stop sign ● The detections fit the objects, though not perfectly
  • 34. Let’s sum that up • Traffic sign recognition with more than 100 categories can be solved with deep learning • Deep learning works well on complicated tasks such as lane and road marking detection • The more data, the better the results
  • 35. Putting It All Together ● Green points indicate the region used for 3D reconstruction ● The detection has to be done in frames where the objects are highlighted in green
  • 36. Results are now integrated: We can get a 3D representation of the detected objects by integrating both results. The final result looks like the image above.
  • 37. Now, objects are represented in 3D ● Detected traffic signs and road markings are converted to 3D ● Each object has a 3D representation after integrating the SfM and object detection results
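One plausible way to combine the two results is sketched below: project the reconstructed 3D points into a frame and keep those that land inside a 2D detection box. The talk does not describe the integration step in detail, so the function `points_in_box`, the median aggregation, the camera matrix, and all coordinates are hypothetical.

```python
import numpy as np

def points_in_box(P, points_3d, box):
    """Return the reconstructed points whose projection falls inside a 2D box.

    P: 3x4 projection matrix of the frame (as estimated by SfM).
    points_3d: (N, 3) array of reconstructed points.
    box: (x_min, y_min, x_max, y_max) detection in pixels.
    """
    homog = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    proj = homog @ P.T
    in_front = proj[:, 2] > 0                  # drop points behind the camera
    pts, proj = points_3d[in_front], proj[in_front]
    uv = proj[:, :2] / proj[:, 2:3]
    x_min, y_min, x_max, y_max = box
    inside = (
        (uv[:, 0] >= x_min) & (uv[:, 0] <= x_max)
        & (uv[:, 1] >= y_min) & (uv[:, 1] <= y_max)
    )
    return pts[inside]

# Hypothetical camera at the origin and a tiny point cloud.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
cloud = np.array([
    [0.5, -0.2, 6.0],    # on the sign
    [0.6, -0.25, 6.1],   # on the sign
    [-2.0, 0.5, 8.0],    # elsewhere in the scene
    [0.5, -0.2, -6.0],   # behind the camera
])

sign_box = (370, 195, 410, 225)                 # hypothetical traffic-sign detection
sign_points = points_in_box(P, cloud, sign_box)
sign_position = np.median(sign_points, axis=0)  # one 3D position for the sign
```

Taking the median is only one choice. A real pipeline would also need to handle background points that project into the same box, for example by filtering on depth.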
  • 38. We are done! ● Reconstructed 3D view, seen from above ● You can see that the detected lanes and road markings now have a 3D representation
  • 39. Using this technique, we could: • Automate the map creation process • Create HD maps for other services • Detect changes automatically