3D Pose Estimation for Transparent Objects
Presenter: 賴柏任
Advisor: Prof. 羅仁權
05.20.2015
Motivation
• Transparent objects are everywhere
• If we know the pose, we can grasp the object!
Problems
• Color of a transparent object changes → hard to locate transparent objects
• Edges of transparent objects are blurry → hard to estimate the pose of transparent objects
Effective cure: address the root causes
• Color of a transparent object changes
• Edges of transparent objects are blurry
Kinect vs. color changes
• Transparent objects produce NaN in the depth map
Ref: I. Lysenkov and V. Rabaud, "Pose estimation of rigid transparent objects in
transparent clutter," in Robotics and Automation (ICRA), 2013 IEEE International
Conference on, 2013, pp. 162-169.
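A minimal sketch of locating the NaN area, assuming the depth map has already been loaded as a floating-point NumPy array in which missing readings are stored as NaN. The names `nan_mask`, `depth`, and `bbox` are illustrative and do not come from the slides.

```python
import numpy as np

def nan_mask(depth: np.ndarray) -> np.ndarray:
    """Return a boolean mask that is True where the depth reading is NaN."""
    return np.isnan(depth)

# Toy 4x5 depth map in metres; NaN marks pixels with no valid depth reading.
depth = np.full((4, 5), 1.2, dtype=np.float32)
depth[1:3, 2:4] = np.nan

mask = nan_mask(depth)
ys, xs = np.nonzero(mask)
# Bounding box of the NaN area as (x_min, y_min, x_max, y_max).
bbox = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
```

The bounding box is one convenient way to hand the candidate area to a later segmentation step.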
GrabCut vs. blurry edges
• Given foreground and background clues
Ref: C. Rother, V. Kolmogorov, and A. Blake, "Grabcut: Interactive foreground
extraction using iterated graph cuts," ACM Transactions on Graphics (TOG), vol.
23, pp. 309-314, 2004.
GrabCut vs. blurry edges
• Generate the probability distribution
GrabCut vs. blurry edges
• Use distance to compensate
GrabCut vs. blurry edges
• OpenCV implementation
A coarse pipeline
Detect NaN area in depth map → Feed the area to GrabCut → Segment the edge
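The coarse pipeline can be sketched with NumPy alone, under stated assumptions. The seed mask below uses the label values of OpenCV's GrabCut mask convention (0 to 3), so it could be passed to `cv2.grabCut` with `cv2.GC_INIT_WITH_MASK`; that call is left out here to keep the sketch dependency-free. The helper names, the `margin` parameter, and the 4-neighbour edge rule are illustrative choices, not taken from the slides.

```python
import numpy as np

# Label values follow OpenCV's GrabCut mask convention.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

def grabcut_seed_mask(depth, margin=2):
    """NaN pixels become probable foreground, a margin around them probable background."""
    nan = np.isnan(depth)
    seed = np.full(depth.shape, GC_BGD, dtype=np.uint8)
    if not nan.any():
        return seed
    ys, xs = np.nonzero(nan)
    h, w = depth.shape
    y0, y1 = max(ys.min() - margin, 0), min(ys.max() + margin, h - 1)
    x0, x1 = max(xs.min() - margin, 0), min(xs.max() + margin, w - 1)
    seed[y0:y1 + 1, x0:x1 + 1] = GC_PR_BGD
    seed[nan] = GC_PR_FGD
    return seed

def mask_edge(fg):
    """Foreground pixels with at least one 4-neighbour in the background."""
    padded = np.pad(fg, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return fg & ~interior

depth = np.full((8, 8), 1.0)
depth[3:6, 3:6] = np.nan
seed = grabcut_seed_mask(depth)
# Stand-in for the segmentation result: here the seed itself is used as foreground.
fg = (seed == GC_FGD) | (seed == GC_PR_FGD)
edge = mask_edge(fg)
```

In a real run, `fg` would come from the mask refined by the segmentation step rather than from the seed.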
How to determine pose?
• Model-based matching
• Rotate the model about the x and y axes and store the resulting edges
[Figure panels: Z-axis, Y-axis]
The problem becomes a 2D-2D matching problem
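A hedged sketch of the template idea: rotate a 3D model about the x and y axes, project each view to 2D, store it, and match an observed 2D shape against the stored views. Several choices here are assumptions made for illustration only: the model is a point cloud, the projection is orthographic along z, a rasterised silhouette is stored instead of an extracted edge, and matching uses intersection over union (IoU). The slides do not specify the matching measure.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def silhouette(points, size=32, scale=10.0):
    """Orthographic projection along z, rasterised into a size x size binary image."""
    px = np.round(points[:, :2] * scale + size / 2).astype(int)
    px = px[(px >= 0).all(axis=1) & (px < size).all(axis=1)]
    img = np.zeros((size, size), dtype=bool)
    img[px[:, 1], px[:, 0]] = True
    return img

def build_templates(model, angles):
    """Store one 2D view per (x-angle, y-angle) pair."""
    return {(ax, ay): silhouette(model @ (rot_y(ay) @ rot_x(ax)).T)
            for ax in angles for ay in angles}

def best_pose(observed, templates):
    """Return the stored angle pair whose view overlaps the observation most (IoU)."""
    def iou(a, b):
        union = (a | b).sum()
        return (a & b).sum() / union if union else 0.0
    return max(templates, key=lambda k: iou(observed, templates[k]))

# Toy model: a box-shaped point cloud with three different side lengths.
gx, gy, gz = np.meshgrid(np.linspace(-1, 1, 21),
                         np.linspace(-0.3, 0.3, 7),
                         np.linspace(-0.1, 0.1, 3))
model = np.stack([gx.ravel(), gy.ravel(), gz.ravel()], axis=1)

angles = [0.0, np.pi / 2]
templates = build_templates(model, angles)
observed = silhouette(model @ (rot_y(np.pi / 2) @ rot_x(0.0)).T)
pose = best_pose(observed, templates)
```

A finer angle grid gives more templates and a finer pose estimate at the cost of more comparisons.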
Where is the model?
• Kinect Fusion
Where is the model?
Wrap your object with paper → Use Kinect Fusion to construct the model → Store the model
What if other objects also produce NaN?
• Some non-transparent objects also produce NaN in the depth map
What if other objects also produce NaN?
• Use characteristics of transparent objects to rule out non-transparent objects:
  • Transparent objects produce highlights
  • The color of a transparent object is similar to that of its surrounding area
What if other objects also produce NaN?
• Transparent objects produce highlights
Ref: K. McHenry, J. Ponce, and D. Forsyth, "Finding glass," in Computer Vision
and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference
on, 2005, pp. 973-979.
What if other objects also produce NaN?
• Transparent objects produce highlights
• Threshold the image at every level from 0 to 255
• Compute the perimeter in each thresholded image
• Compute the final threshold by line fitting (scanning from 255 down to 0)
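The three steps above can be read in more than one way; the sketch below is one possible reading, not the method of the cited paper. Assumptions: the perimeter is the number of foreground pixels touching the background, the scan starts at the brightest level that still contains pixels, and the threshold is placed where the perimeter first departs from a line fitted to the previous `window` levels by more than `tol`. The names `perimeter`, `perimeter_curve`, `highlight_threshold`, `window`, and `tol` are hypothetical.

```python
import numpy as np

def perimeter(binary):
    """Count foreground pixels that touch the background (4-connectivity)."""
    padded = np.pad(binary, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return int((binary & ~interior).sum())

def perimeter_curve(gray):
    """Perimeter of the region gray >= t for every level t in 0..255."""
    return np.array([perimeter(gray >= t) for t in range(256)])

def highlight_threshold(gray, window=10, tol=5.0):
    """Scan from bright to dark; stop where the perimeter leaves the fitted line."""
    curve = perimeter_curve(gray)
    nonzero = np.nonzero(curve)[0]
    if nonzero.size == 0:
        return None
    t_max = int(nonzero.max())
    for t in range(t_max - window, -1, -1):
        ts = np.arange(t + 1, t + 1 + window)
        slope, intercept = np.polyfit(ts, curve[ts], 1)
        if abs(curve[t] - (slope * t + intercept)) > tol:
            return t + 1
    return 0

# Toy image: dark background (50) with one small bright highlight (250).
gray = np.full((20, 20), 50, dtype=np.uint8)
gray[5:9, 5:9] = 250
threshold = highlight_threshold(gray)
highlight = gray >= threshold
```

On this toy image the perimeter stays flat while only the highlight is selected and jumps once the background joins in, which is where the sketch places the threshold.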
What if other objects also produce NaN?
• The color of a transparent object is similar to that of its surrounding area
What if other objects also produce NaN?
• The color of a transparent object is similar to that of its surrounding area
Hue histogram
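One way to turn the hue-histogram cue into a number is to compare the hue histogram inside a candidate region with that of a ring around it. This is a sketch under assumptions: the hue channel is already available as floats in [0, 1), the ring is built from the candidate's bounding box, and similarity is measured by histogram intersection. The function names, `width`, and `bins` are illustrative and not from the slides.

```python
import numpy as np

def hue_hist(hue, mask, bins=18):
    """Normalised hue histogram of the pixels selected by mask."""
    hist, _ = np.histogram(hue[mask], bins=bins, range=(0.0, 1.0))
    total = hist.sum()
    return hist / total if total else hist.astype(float)

def surrounding_ring(mask, width=3):
    """Pixels within `width` pixels of the mask's bounding box but outside the mask."""
    ys, xs = np.nonzero(mask)
    h, w = mask.shape
    box = np.zeros_like(mask)
    box[max(ys.min() - width, 0):min(ys.max() + width, h - 1) + 1,
        max(xs.min() - width, 0):min(xs.max() + width, w - 1) + 1] = True
    return box & ~mask

def hue_similarity(hue, mask, width=3, bins=18):
    """Histogram intersection in [0, 1]; 1 means identical hue distributions."""
    inside = hue_hist(hue, mask, bins)
    around = hue_hist(hue, surrounding_ring(mask, width), bins)
    return float(np.minimum(inside, around).sum())

mask = np.zeros((20, 20), dtype=bool)
mask[8:12, 8:12] = True

# Transparent-like candidate: same hue as the surroundings.
hue_clear = np.full((20, 20), 0.3)
# Opaque-like candidate: hue differs from the surroundings.
hue_opaque = hue_clear.copy()
hue_opaque[mask] = 0.8

sim_clear = hue_similarity(hue_clear, mask)
sim_opaque = hue_similarity(hue_opaque, mask)
```

A candidate with high similarity would be kept as transparent and one with low similarity ruled out; the cut-off value is a tuning choice.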
A fine pipeline
Some results
• Pose matching
Some results
• Over 200 candidates were retrieved in total
Method            Recall    Precision
Only NaN          86.11%    38.24%
Characteristics   86.11%    93.93%
Recall = (2/2) × 100% = 100%
Precision = (2/5) × 100% = 40%
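The two worked numbers above follow the usual definitions of recall and precision. A small sketch, assuming the example means that both of 2 true objects were found among 5 retrieved candidates; the names `tp`, `fp`, and `fn` (true positives, false positives, false negatives) are introduced here for illustration.

```python
def recall(tp: int, fn: int) -> float:
    """Percentage of true objects that were retrieved."""
    return 100.0 * tp / (tp + fn)

def precision(tp: int, fp: int) -> float:
    """Percentage of retrieved candidates that are true objects."""
    return 100.0 * tp / (tp + fp)

# 2 true objects, both retrieved; 5 candidates retrieved in total.
tp, fn, fp = 2, 0, 3
example_recall = recall(tp, fn)
example_precision = precision(tp, fp)
```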
Some other problems
• How can the robot grasp the object?
• Are there alternatives to Kinect?
How can the robot grasp the object?
• Teach and Play
[Figure label: grasp points]
Are there alternatives to Kinect?
• Extract the visual words of transparent objects
Are there alternatives to Kinect?
Ref: M. Fritz, G. Bradski, S. Karayev, T. Darrell, and M. J. Black, "An additive latent
feature model for transparent object recognition," in Advances in Neural
Information Processing Systems, 2009, pp. 558-566.
Are there alternatives to Kinect?
• The result can be used as the input to GrabCut
Thank you!
