2. • Dong-Won Shin
• Gwangju Institute of Science and Technology
• PhD candidate in Computer Science
• SLAM Research Group KR manager
• (blog) dongwonshin.net
• (github) github.com/JustWon
• (E-mail) dongwonshin@gist.ac.kr
Speaker
2
3. • Introduction to SLAM
• Traditional SLAM approaches
• Recent SLAM approaches
• Exercise
Contents
3
5. • What is SLAM?
• Computational problem of constructing a map of an environment
while simultaneously keeping track of a robot’s location
• Application
Simultaneous Localization and Mapping
Augmented reality Virtual reality Robotics
https://en.wikipedia.org/wiki/Simultaneous_localization_and_mapping
5
6. • Visual localization
• Under inaccurate GPS
• GPS-denied environment
• Ex)
• Mapping
• Scenarios in which a prior map is not available and needs to be built.
• Map can inform path planning or provide an intuitive visualization for a human or robot.
• Ex)
Simultaneous Localization and Mapping
6
C. Cadena et al., "Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age," IEEE Trans. Robot., vol. 32, no. 6, pp. 1309–1332, 2016.
Indoor environment Skyscraper
Disaster area Private room
7. • Bayesian Filtering based SLAM
• The theoretical foundations and prototypes of the traditional
Bayesian-filtering-based SLAM framework emerged in the 1990s.
• Ex) EKF SLAM, FastSLAM
• Visual odometry
• The process of estimating the ego-motion of a robot using
only the input of a single or multiple cameras attached to it.
• Ex) stereo VO, monocular VO
• Structure from motion
• Investigating the problem of recovering relative camera poses
and 3D structure from a set of camera images
• Off-line version of visual SLAM
Earlier Inspirations
7
8. • Solutions to large-scale map management
• 1) graph-based SLAM and loop closure detection
• 2) efficient map representation and refinement: sparse, dense, and semi-dense
• Graph-based SLAM
• Constructing a graph whose nodes represent robot poses or landmarks
• An edge between nodes encodes a sensor measurement that constrains the connected poses
• Loop closure detection
• Detecting loop closures in a map to give additional constraint for the consistent mapping
Effort towards Large Scale Mapping
8
10. • Frontend
• Camera motion between adjacent images
• A quantitative measure of the camera motion is needed
• Rotation and translation
• Understanding the geometric relationship between the camera and points in 3D space
• Drift error
• Estimating the trajectory with visual odometry alone accumulates error
• Solution
• Backend optimization and loop closure detection are needed
Visual Odometry
10
11. • Sensor noise
• Even precise sensors have noise
• Cheap sensors have large measurement errors
• Some sensors are affected by magnetic fields and temperature
• Backend optimization
• Estimating the state of the whole system from noisy data (state estimation)
• Removes the drift error produced by the frontend
• The frontend supplies the backend with the data to optimize and their initial values
• Types of backend optimization
• Filter based (Kalman filter, particle filter)
• Nonlinear optimization based (bundle adjustment, pose graph optimization)
Backend Optimization
11
12. • Determine whether the current location has been visited before
• Using QR codes
• Environmental constraints exist
• Using camera images
• Judge the similarity between images
• The backend receives the loop closure information and reflects it in localization and mapping
• Removes drift error and produces a globally consistent map
Loop Closure Detection
12 T. Sattler et al., “Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions,” 2017.
13. • The process of building a map of the environment
• Implemented differently depending on the application
• Robot vacuum cleaner
• A 2D map is sufficient
• Autonomous drone
• Moves in 6 DOF, so a 3D map is needed
• Map representation
• Point cloud
• Voxel
• Surfel
• Occupancy grid map
• …
• Types by map density
• Sparse, dense, and semi-dense maps
Mapping
13
14. • Sparse SLAM
• Only use a small selected subset of the pixels (features) from a monocular color camera
• Fast and real time on CPU but it produces a sparse map (point clouds)
• Landmark-based or feature-based representations
• ORB SLAM
• Complete SLAM system for monocular camera
• Real-time on standard CPUs in a wide variety of environments
• small hand-held sequences indoors
• drones flying in industrial environments
• cars driving around a city
Modern State of the Art Systems
14
15. • Dense SLAM
• Use most or all of the pixels in each received frame
• Or use depth images from a depth camera
• It produces a dense map but GPU acceleration is necessary for the real-time operation.
• Volumetric model or surfel-based representations
• KinectFusion
• Modelling of natural scenes with only commodity sensor and GPU
• Real-time markerless tracking
• Dense surface reconstruction
Modern State of the Art Systems
15
16. • Lidar SLAM
• Make use of the Lidar sensor input for the localization and mapping
• Oriented toward autonomous driving in outdoor environments
• LOAM
• Very low drift error using the edge and planar features
• Low computation complexity
Modern State of the Art Systems
16
17. Investments in SLAM Field
17
SLAMcore Scape Technology
https://www.mmcventures.com/mmc-ventures-backs-slamcore-in-5-million-seed-funding-2/
https://techcrunch.com/2019/01/08/scape/
20. • Feature-based monocular SLAM system
• operates in real time, in small and large, indoor and outdoor environments
• URL: https://github.com/raulmur/ORB_SLAM2
• Main components
• Tracking
• Local mapping
• Loop closing
• Data structures
• Map points
• Keyframes
• Covisibility graph
• Essential graph
ORB SLAM
20
Map points & keyframes Covisibility graph Essential graph
R. Mur-Artal, J. M. M. Montiel, and J. D. Tardos, "ORB-SLAM: A Versatile and Accurate Monocular SLAM System," IEEE Trans. Robot., vol. 31, no. 5, pp. 1147–1163, 2015.
21. • Map points p_i
• 3D position X_{w,i}
• Viewing direction n_i
• Representative ORB descriptor D_i
• Keyframes K_i
• Camera pose T_{iw}
• Camera intrinsics (focal length and principal point)
• All the ORB features extracted in the frame
Data structures
21
Map points & keyframes
22. • Covisibility graph
• Undirected weighted graph
• Node: keyframe
• Edge: if two keyframes share observations of
the same map points (at least 15)
• Weight 𝜃: the number of common map points
• Essential graph
• Retain all the nodes but less edges
• Subset of edges from the covisibility graph with high covisibility
+ loop closure edges
Data structures
22
Covisibility graph
Essential graph
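To make the covisibility bookkeeping concrete, here is a minimal Python sketch that builds the weighted edges from hypothetical keyframe observations (the 15-point threshold follows the slide; the observation data is invented):

```python
from itertools import combinations

# Toy observations (hypothetical): keyframe id -> ids of the map points it observes
observations = {
    0: set(range(0, 40)),
    1: set(range(20, 60)),   # shares 20 map points with keyframe 0
    2: set(range(55, 100)),  # shares only 5 with keyframe 1
}

MIN_SHARED = 15  # ORB-SLAM adds an edge only for at least 15 common observations

covisibility = {}  # (kf_a, kf_b) -> edge weight = number of common map points
for a, b in combinations(sorted(observations), 2):
    weight = len(observations[a] & observations[b])
    if weight >= MIN_SHARED:
        covisibility[(a, b)] = weight
```

The essential graph would then keep all nodes but retain only the highest-weight subset of these edges plus loop closure edges.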
23. • ORB feature extraction
• For tracking, mapping, and place recognition tasks
• Robust to rotation and scale
• Good invariance to camera auto-gain and auto-exposure, and illumination changes
• Fast to extract and match allowing for real-time operation
• Show good precision/recall performances in bag-of-word place recognition
Tracking
23
E. Rublee, V. Rabaud, K. Konolige, and G. R. Bradski, "ORB: An efficient alternative to SIFT or SURF," ICCV, pp. 2564–2571, 2011.
Example of orb features
24. • Initial pose estimation
• Case 1: if tracking was successful for the last frame
• constant velocity motion model
• Bundle adjustment in two view case
Tracking
24
Minimizing the reprojection error
https://cs.nyu.edu/~fergus/teaching/vision/11_12_multiview.pdf
25. • Initial pose estimation
• Case 2: if the tracking is lost
• global relocalization
• Bag-of-visual-words
• Converting an image into a representative descriptor
• Perspective-n-Point (PnP) problem
• Estimating the camera pose from 2D–3D correspondences
Tracking
25
26. • New keyframe decision
• (condition 1) Good relocalization
• more than 20 frames must have passed from the last global relocalization
• (condition 2) Last keyframe insertion
• more than 20 frames have passed from the last keyframe insertion
• (condition 3) Good tracking
• Current frame tracks at least 50 points
Tracking
26
(Diagrams: more than 20 frames between a global relocalization and a new keyframe; more than 20 frames since the last keyframe; at least 50 points tracked in the current frame)
27. • Keyframe insertion
• Compute the bags of words representation
• Update the covisibility graph
• New map point creation
• Finding the matches by the epipolar geometry
• Initial map point creation via triangulation
• Consistency check
• Parallax
• Positive depth in both cameras
• Reprojection error
Local Mapping
27
Keyframe feature vector
Covisibility graph
28. • Local keyframe culling
• To maintain a compact reconstruction, detect redundant keyframes and delete them
• Redundancy check
• A keyframe is redundant if 90% of its map points have been seen in at least three other keyframes
• Recent map point culling
• To retain a compact map
• Association check
• (# frames where the match is found) / (# frames where the map point is predicted visible) > 0.25
• Local bundle adjustment
• Local map points
• K_i: currently processed keyframe
• K_c: all the keyframes connected to K_i in the covisibility graph
• All the map points seen by those keyframes
Local Mapping
28
29. • Loop candidates detection
• Compute the similarity between the bag of words vector
• Current keyframe and all its neighbors in the covisibility graph
• Query the recognition DB
• Compute the similarity transformation
• 7 DOF (3 rotations, 3 translations, scale)
• Scale drift over time in a monocular SLAM
• Similarity transform between current keyframe and the loop closure candidate
Loop Closing
29
(Diagram: the current keyframe queries a keyframe recognition DB of keyframes 1 … N)
https://www.youtube.com/watch?v=v9lGUGRehkM
30. • Loop fusion
• When the loop closure detection is triggered, the loop fusion is performed.
• Keyframe and map point correction
• Insert loop edges in the covisibility graph
• Current keyframe pose is corrected with the similarity transformation that we obtained
• Fuse duplicated map points
• Essential graph optimization
• For all keyframes and all map points
Loop Closing
30
Essential graph
33. KinectFusion
• Dense RGB-D SLAM system
• Only the depth data from Kinect is used
• to track the 3D pose of the sensor
• to reconstruct 3D models of the physical scene in real-time
• GPU-based pipeline
• Application
• Core system for low-cost handheld scanning
• Geometry-aware augmented reality
• Physics-based interactions
• Real-time multi-touch interactions anywhere
33
34. • Depth to vertex
• Transform each pixel of a new depth image into a 3D point (vertex)
• v_i(u) = D_i(u) K^-1 [u, 1]^T
• Vertex to normal
• Compute the normal vectors for each vertex of a point cloud
• Normals are used in the projective data association step of the ICP algorithm
• n_i(u) = (v_i(x+1, y) − v_i(x, y)) × (v_i(x, y+1) − v_i(x, y))
Depth Map Conversion
34
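The two conversions above can be sketched with NumPy (toy intrinsics and a synthetic flat depth image; an illustration, not KinectFusion's GPU pipeline):

```python
import numpy as np

def depth_to_vertex(depth, K):
    """Back-project each pixel u of a depth image: v(u) = D(u) * K^-1 * [u, 1]^T."""
    h, w = depth.shape
    K_inv = np.linalg.inv(K)
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # homogeneous pixel coords
    return (pix @ K_inv.T) * depth[..., None]            # scale each ray by its depth

def vertex_to_normal(v):
    """Normal per vertex from neighbouring differences:
    n(u) = (v(x+1, y) - v(x, y)) x (v(x, y+1) - v(x, y))."""
    dx = v[:, 1:, :] - v[:, :-1, :]
    dy = v[1:, :, :] - v[:-1, :, :]
    n = np.cross(dx[:-1, :, :], dy[:, :-1, :])
    length = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.where(length > 0, length, 1.0)

# A flat depth plane should yield normals pointing straight back at the camera
vertices = depth_to_vertex(np.full((4, 4), 2.0), np.eye(3))
normals = vertex_to_normal(vertices)
```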
35. Camera Tracking
35
• Iterative Closest Points (ICP)
• Widely used for geometric alignment of three-dimensional models
• Start with two meshes and an initial guess for their relative rigid-body transform
• Refine the transform by repeatedly generating pairs of corresponding points
• Methods
• Point to Point
• Point to Plane
• …
36. • Minimize the perpendicular distance from the source point to the tangent plane of
the destination point
• Nonlinear least-squares solution using the Levenberg-Marquardt method
Point-to-Plane ICP
36
K. Low, "Linear Least-squares Optimization for Point-to-plane ICP Surface Registration," Tech. Rep., Univ. North Carolina at Chapel Hill, 2004.
s_i = (s_ix, s_iy, s_iz, 1)^T : source point
d_i = (d_ix, d_iy, d_iz, 1)^T : destination point
n_i = (n_ix, n_iy, n_iz, 0)^T : unit normal vector at d_i
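A minimal sketch of one linearized point-to-plane step (small-angle approximation, following K. Low's linear least-squares formulation; the point data is synthetic):

```python
import numpy as np

def point_to_plane_step(src, dst, normals):
    """One linearized point-to-plane step: solve for x = [alpha, beta, gamma, tx, ty, tz]
    minimizing sum_i (((R s_i + t) - d_i) . n_i)^2 with R approximated as I + [w]x."""
    A = np.hstack([np.cross(src, normals), normals])  # row i: [(s_i x n_i)^T, n_i^T]
    b = np.einsum('ij,ij->i', dst - src, normals)     # signed distance along each normal
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Pure translation between the two clouds: the solver should recover it exactly
rng = np.random.default_rng(0)
src = rng.normal(size=(12, 3))
normals = rng.normal(size=(12, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
t_true = np.array([0.1, -0.2, 0.3])
x = point_to_plane_step(src, src + t_true, normals)
```

In practice this step is iterated, re-associating correspondences each time, which is where the nonlinear (e.g. Levenberg-Marquardt) outer loop comes in.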
37. • Voxel
• A value on a regular grid in 3D space, the extended concept of 2D pixel
• Widely used for realistic rendering 3D object in computer graphics
• Data
• Truncated signed distance function (TSDF) value
• Truncated signed distance function (TSDF) weight
Volumetric Representation
37
38. • Truncated signed distance function (TSDF)
• Predefined 3D volume is subdivided uniformly into a 3D grid of voxels
• TSDF values are positive in front of the surface and negative behind it
• The zero-crossing marks the surface of the object
Volumetric Representation
38
39. • TSDF integration
• If a voxel is behind the surface observed in the new depth image,
the image does not contain any new information about it, and the function returns.
• If the voxel is close to or in front of the observed surface,
a corresponding observation is added to the accumulated sum.
Volumetric Integration
39 S. Izadi et al., "KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera," in Proc. ACM UIST, 2011.
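The integration rule can be sketched per voxel as a weighted running average (the truncation distance and weight cap here are hypothetical values, not the paper's):

```python
def integrate_voxel(tsdf, weight, sdf_obs, trunc=0.1, max_weight=64.0):
    """Weighted running-average update of one voxel's TSDF.
    sdf_obs: signed distance from the voxel to the surface observed in the new
    depth image (positive in front of the surface, negative behind)."""
    if sdf_obs < -trunc:
        return tsdf, weight            # far behind the surface: no new information
    d = min(1.0, sdf_obs / trunc)      # truncate to the range [-1, 1]
    new_tsdf = (tsdf * weight + d) / (weight + 1.0)
    return new_tsdf, min(weight + 1.0, max_weight)

state = (0.0, 0.0)                               # empty voxel: TSDF value and weight
state = integrate_voxel(*state, sdf_obs=0.05)    # observation just in front of the voxel
state = integrate_voxel(*state, sdf_obs=-0.5)    # far behind the surface: ignored
```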
40. • Motivation
• Depth image is computed from the updated 3D world model given a camera pose
• Input to the tracking step in the next frame and also for visualization.
• Raycasting
• A ray is being cast from the camera up until an intersection with the surface is found
• Checking the value of the TSDF at each voxel along the ray until a zero-crossing is found
Raycasting
40
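A simple dense-grid sketch of this ray march (synthetic TSDF of a plane; real systems step at finer resolution and interpolate the TSDF trilinearly):

```python
import numpy as np

def raycast(tsdf, origin, direction, step=0.5, max_steps=200):
    """March along the ray, sampling the TSDF at the nearest voxel, until the
    value flips from positive (in front of the surface) to non-positive
    (behind); the zero-crossing marks the surface."""
    direction = np.asarray(direction, float)
    direction /= np.linalg.norm(direction)
    prev_pos = prev_val = None
    for k in range(max_steps):
        pos = np.asarray(origin, float) + k * step * direction
        idx = tuple(int(round(c)) for c in pos)
        if any(i < 0 or i >= s for i, s in zip(idx, tsdf.shape)):
            return None                      # ray left the volume without a hit
        val = tsdf[idx]
        if prev_val is not None and prev_val > 0 >= val:
            t = prev_val / (prev_val - val)  # interpolate the zero-crossing
            return prev_pos + t * (pos - prev_pos)
        prev_pos, prev_val = pos, val
    return None

# Synthetic TSDF of a plane at z = 10: positive in front of it, negative behind
volume = np.fromfunction(lambda x, y, z: 10.0 - z, (32, 32, 32))
hit = raycast(volume, origin=(16.0, 16.0, 0.0), direction=(0.0, 0.0, 1.0))
```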
43. • Low-drift and real-time lidar odometry and mapping
• Github: https://github.com/laboshinl/loam_velodyne
• Main components
• LiDAR odometry
• Compute the motion of the Lidar between two consecutive sweeps at a frequency of 10Hz
• LiDAR mapping
• Matches and registers the local cloud onto a map at a frequency of 1Hz
LOAM
43
J. Zhang and S. Singh, “Low-drift and real-time lidar odometry and mapping,” Auton. Robots, vol. 41, no. 2, pp. 401–416, 2017.
44. • Feature point extraction
• Edge features
• Planar features
• Curvature
• Smoothness of the local surface
• Sort the points based on the c values
• Classification
• Planar features : from the minimum c
• Edge features: from the maximum c
LiDAR Odometry
44
(Diagram: points sorted by c; planar features taken from the minimum, edge features from the maximum)
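The smoothness computation and edge selection can be sketched on a synthetic scan line (the window size and data are invented; LOAM additionally splits each scan into subregions and caps the number of features per region):

```python
import numpy as np

def curvature(scan, half_window=3):
    """LOAM-style smoothness of each point in one scan line:
    c_i = || sum_{j in window, j != i} (x_i - x_j) || / (|window| * ||x_i||)."""
    n = len(scan)
    c = np.full(n, np.nan)                     # boundary points get no score
    for i in range(half_window, n - half_window):
        window = np.vstack([scan[i - half_window:i],
                            scan[i + 1:i + 1 + half_window]])
        diff = (scan[i] - window).sum(axis=0)
        c[i] = np.linalg.norm(diff) / (len(window) * np.linalg.norm(scan[i]))
    return c

# Synthetic scan line: a straight segment with a sharp corner at index 10
scan = np.array([(i, 1.0, 0.0) for i in range(11)] +
                [(10.0, 1.0 + j, 0.0) for j in range(1, 11)])
c = curvature(scan)
# Planar features come from the minimum c, edge features from the maximum c
edge_idx = int(np.nanargmax(c))
```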
45. • Outlier rejection
• (a) Points on local planar surfaces that are roughly parallel to the laser beam
• The neighborhood forms a surface patch whose normal is within 10 degrees of the laser beam
• (b) Points on the boundary of occluded regions
• The neighborhood is disconnected from the point by a gap in the direction of the laser beam
LiDAR Odometry
45
46. • Finding feature point correspondence
• For edge features in the current point clouds,
• Find edge lines in the previous point clouds
• Point-to-edge distance
• For planar features in the current point clouds,
• Find planar patches in the previous point clouds
• Point-to-plane distance
LiDAR Odometry
46
47. • Motion estimation
• Compute residuals according to
• point-to-edge distance
• point-to-plane distance
• Establish Jacobian matrix J
• Levenberg-Marquardt method
• Bisquare weight
LiDAR Odometry
47
48. • Pseudo code
LiDAR Odometry
48
For edge feature
For planar feature
Optimization
Feature classification
49. • Mapping the current local point cloud onto the map
• Store the map point cloud in a 10 m cubic area around the current position
• Extract feature points in the same way (edge and planar features)
• Find correspondences
• S’: neighboring points in a certain region around the feature points
• Covariance matrix M
• Eigendecomposition
• Edge line: one eigenvalue is significantly larger than the other two
• Planar patch: two large eigenvalues with the third one significantly smaller
LiDAR Mapping
49
(Figure: map quality without and with lidar mapping)
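The eigenvalue test can be sketched as follows (the dominance ratio of 10 is an invented threshold; the paper only requires one eigenvalue to be "significantly larger"):

```python
import numpy as np

def classify_cluster(points, ratio=10.0):
    """Classify a local point cluster by the eigenvalues of its covariance matrix M:
    one dominant eigenvalue -> edge line, two dominant -> planar patch."""
    w = np.sort(np.linalg.eigvalsh(np.cov(np.asarray(points, float).T)))[::-1]
    if w[0] > ratio * max(w[1], 1e-12):
        return "edge"
    if w[1] > ratio * max(w[2], 1e-12):
        return "planar"
    return "scattered"

line = [(t, t, 0.0) for t in range(10)]                      # points along a 3D line
xs, ys = np.meshgrid(np.arange(5), np.arange(5))             # points on a plane
plane = np.stack([xs.ravel(), ys.ravel(), np.zeros(25)], axis=1)
```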
50. • Visual-lidar Odometry and Mapping: Low-drift, Robust, and Fast
• Authors: Ji Zhang and Sanjiv Singh
• Main components
• Visual odometry
• At high frame rate (60 Hz) to estimate motion
• Lidar odometry
• At low frequency (1 Hz) to refine the motion estimate
V-LOAM
50
51. • Depth map registration
• World map points are projected into the color camera view
• Requires the rotation and translation of the previous frame
• Only points in front of the camera are kept; points received more than a certain time ago are forgotten
• Projected depth map
• Visual features are associated with distance
Visual Odometry
51
color image point cloud & visual features
52. • Frame to Frame motion estimation
• Known distance case
• Unknown distance case
• Levenberg-Marquardt method
Visual Odometry
52
* In this case, only the depth map at frame (k−1) is always available.
53. • Sweep to Sweep refinement
• Extract geometric features like edge points and planar points
• Compute curvature in local scans
• Outlier rejection
• Find correspondences
• Motion estimation with the nonlinear optimization
Lidar Odometry
53
54. • Sweep to Map registration
• Add undistorted point cloud to the map
• Find correspondences
• Examining distributions of local point clusters
• Edge segment: one large and two small eigenvalues
• Planar patch: two large and one small eigenvalues
Lidar Odometry
54
59. • maplab: An Open Framework for Research in Visual-inertial Mapping and Localization
• Main contribution
• General purpose visual-inertial mapping framework with multi-session support
• Robust visual-inertial estimator tightly coupled with a localization system
• Extensibility of the system that makes it well suited for research
• Main components
• ROVIOLI: VIO and localization frontend
• Maplab-console
MapLab
59
60. • Robust visual-inertial odometry with localization integration
• Mapping and localization frontend
• Build maps from raw visual and inertial data
• Localize with respect to existing maps online
• Feature tracking module
• Track BRISK or FREAK keypoints
• Correspondences by matching descriptors from frame to frame
• Frame localization
• 2d-3d correspondences against the provided localization map
• P3P algorithm within a RANSAC scheme
• ROVIO
• Map builder
ROVIOLI
60
61. • Console user interface to manipulate maps
• Multiple maps can be loaded into the console simultaneously.
• All algorithms are available through console commands.
• Parameters specific to each algorithm are set by console flags
• Real-time visualization of the map in Rviz
• It is possible to combine multiple algorithms and experiment with entire pipelines.
• Changes can be easily reverted by saving and reloading intermediate states
MapLab Console
61
https://github.com/ethz-asl/maplab/wiki
62. • VI-map: Visual-inertial mapping data
• Raw measurements of all sensors
• Sparse reconstruction of the covered environment
• Mission (submap)
• Vertex: state captured at a certain point in time
• Edge: connects two neighboring vertices
• IMU edge
• Loop-closure edge
Map Structure
62
63. • For many robotic applications
• Very important to have access to drift-free global pose estimates
• Example
• Robotic manipulation
• precise navigation
• Teach & repeat scenarios
• ROVIOLI to create an initial VI-map
• Sensor data from
• Offline in a Rosbag
• Online using ROS topics
• VI-map is automatically
• loop-closed
• Optimized
• Keyframed
• Summarized
Online Mapping
63
64. • In many mapping applications,
• It is not possible to cover the entire environment within a single mapping session
• Maplab offers tools to co-register maps from multiple sessions
Multi-session Mapping
64
65. • Large-scale, multi-session map of the old town of Zurich
• Raw visual-inertial data recorded in 45 sessions on two different days
• Initial open loop maps on Google Tango tablets with ROVIOLI
• Globally consistent map on a desktop with 32 GB RAM
• Total duration: 231 min
• Total length: 16.48 km
• Map size: 480 MB
Large-Scale Mapping
65
66. • A dense 3D representation of the environment is required for
• Path planning
• Inspection and object detection
• Collision checking and obstacle avoidance
• Stereo dense reconstruction
• semi-global stereo matching
• TSDF-based depth fusion
• Voxblox (volumetric mapping library)
Dense Reconstruction
66
67. • Most of the basic tools needed in visual-inertial mapping and localization
• A rich set of helper functions, queries, and manipulation tools
• Easy integration of new algorithms into the system
• Extensive examples demonstrating how to extend the framework
• Related research
Using MapLab for Research
67
Topomap: Topological Mapping and
Navigation Based on Visual SLAM Map
Collaborative Navigation
for Flying and Walking Robots
Real-time visual-inertial mapping,
re-localization and planning onboard
mavs in unknown environments
Appearance-based landmark selection
for efficient long-term visual localization
Keep it brief: Scalable creation of
compressed localization maps
The gist of maps-summarizing
experience for lifelong localization
Real-time visual-inertial localization
for aerial and ground robots
Map API-scalable decentralized
map building for robots
https://github.com/ethz-asl/maplab/wiki/Related-Research
69. • Real-time, large-scale depth fusion and tracking framework
• URL: https://github.com/victorprad/InfiniTAM
• Main components
• Tracking
• Allocation
• Integration
• Raycasting
• Relocalisation & Loop Closure Detection
• Data structure
• Volumetric representation
• Voxel block hashing
InfiniTAM
69
V. A. Prisacariu et al., “InfiniTAM v3: A Framework for Large-Scale 3D Reconstruction with Loop Closure,” 2017.
70. • Volumetric representation using a hash lookup
• Data structures and operations
• Voxel
• Voxel block hash
• Hash table and hashing function
• Hash table operations
• Voxel
• A value on a regular grid in 3D space, the extended concept of 2D pixel
• Widely used for realistic rendering 3D object in computer graphics
• Data
• TSDF value
• TSDF Weight
• Color value
• Color weight
Volumetric Representation
70
71. • Voxel block array
• Majority of data stored in the regular voxel grid is marked either as
• Free space
• Unobserved space
• Only store the surface data by efficient hashing scheme
• Grouped voxels in blocks of predefined size (ex. 8x8x8)
• Data
• Positions of the corner of the 8x8x8 voxel block
• Offset in the excess list
• Pointer to the voxel block array
Volumetric Representation
71
M. Nießner, M. Zollhöfer, S. Izadi, and M. Stamminger, "Real-time 3D reconstruction at scale using voxel hashing," ACM Trans. Graph., vol. 32, no. 6, pp. 1–11, 2013.
72. • Hash table
• To quickly and efficiently find the position of a certain voxel block in the voxel block array
• Hashing function
• Maps the corner coordinates of a 3D voxel block to an entry of the hash table
• Hash collision case
• Use the additional unordered excess list
• Store an offset in the voxel block array
Volumetric Representation
72
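A minimal sketch of the hashing scheme (the primes are the ones published by Nießner et al.; the table layout here is simplified Python, with bucket lists standing in for the ordered part plus excess list):

```python
# Primes from the voxel hashing scheme of Niessner et al. 2013
P1, P2, P3 = 73856093, 19349669, 83492791
TABLE_SIZE = 1 << 20

def block_hash(bx, by, bz):
    """Map the integer corner coordinates of a voxel block to a bucket index."""
    return ((bx * P1) ^ (by * P2) ^ (bz * P3)) % TABLE_SIZE

class VoxelHashTable:
    """Minimal sketch: each bucket holds a list of (block coords, array index);
    colliding blocks share a bucket list, standing in for the excess list."""
    def __init__(self):
        self.buckets = {}
        self.block_array = []             # stands in for the voxel block array

    def allocate(self, coord):
        bucket = self.buckets.setdefault(block_hash(*coord), [])
        for c, idx in bucket:
            if c == coord:
                return idx                # block already allocated
        self.block_array.append(coord)    # would hold an 8x8x8 TSDF block
        bucket.append((coord, len(self.block_array) - 1))
        return len(self.block_array) - 1

table = VoxelHashTable()
first = table.allocate((1, 2, 3))
again = table.allocate((1, 2, 3))
other = table.allocate((-4, 0, 7))
```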
73. • To determine the pose of a new camera frame given the 3D world model
• Diverse methods
• Using only the depth
• Inspired by Point-to-Plane ICP
• Using only the color
• Inspired by the direct method
• Using both data
• Utilize both approaches
• Main differences of the extended tracker
• Huber-norm instead of the standard L2 norm
• Error term weighted by its depth measurement
Tracking
73
(Plot: least-squares norm vs Huber norm)
74. • Three main stages in allocation
• 1) backproject a line connecting 𝑑 − 𝜇 to 𝑑 + 𝜇
• 𝑑 : depth in image coordinates
• 𝜇: a fixed, tunable parameter
• This leads to a line in world coordinates, which intersects a number of voxel blocks.
• Search the hash table for each of these blocks and look for a free hash entry
• 2) allocate voxel blocks for each non zero entry in the allocation and visibility arrays
• 3) build a list of live hash entries
Allocation
74
(Diagram: the segment from d−μ to d+μ back-projected into the virtual voxel block grid)
75. • TSDF integration
• If a voxel is behind the surface observed in the new depth image,
the image does not contain any new information about it, and the function returns.
• If the voxel is close to or in front of the observed surface,
a corresponding observation is added to the accumulated sum.
Integration
75 S. Izadi et al., "KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera," in Proc. ACM UIST, 2011.
76. • Motivation
• Depth image is computed from the updated 3D world model given a camera pose
• Input to the tracking step in the next frame and also for visualization.
• Raycasting
• A ray is being cast from the camera up until an intersection with the surface is found
• Checking the value of the TSDF at each voxel along the ray until a zero-crossing is found
• State machine to efficiently handle the sparse volumetric space
• SEARCH_BLOCK_COARSE
• SEARCH_BLOCK_FINE
• SEARCH_SURFACE
• BEHIND_SURFACE
Raycasting
76
77. • Keyframe-based random ferns relocaliser
• To relocalise the camera when tracking fails
• To detect loop closures when aiming to construct a globally-consistent scene
• Procedure
• Downsample and preprocess image
• Each of the m code blocks is obtained by applying a random fern to I
• A fern is a set of n binary feature tests on the image, each yielding either 0 or 1
• b_{F_k}(I) ∈ B^n denotes the n-bit binary code resulting from applying fern F_k to I
• b(I) ∈ B^{mn} denotes the result of concatenating all m such binary codes for I
• Dissimilarity measure between two images I and J: the block-wise Hamming distance between b(I) and b(J)
• dis(I, J) = Σ_{k=1..m} [b_{F_k}(I) ≠ b_{F_k}(J)], where a term is 0 if the two code blocks are identical and 1 otherwise
Relocalisation & Loop Closure Detection
77
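A toy sketch of the fern encoding and the block-wise Hamming dissimilarity (the fern count, test count, and image size are invented; real relocalisers apply the tests to intensity and depth channels):

```python
import numpy as np

rng = np.random.default_rng(42)
M_FERNS, N_TESTS = 4, 8      # m code blocks, n binary tests each (toy sizes)
H, W = 16, 16                # downsampled image resolution

# Each fern: n random (row, column, threshold) binary pixel tests
ferns = [(rng.integers(0, H, N_TESTS), rng.integers(0, W, N_TESTS),
          rng.uniform(0.0, 1.0, N_TESTS)) for _ in range(M_FERNS)]

def encode(img):
    """Apply every fern to the image, yielding the m n-bit code blocks b_Fk(I)."""
    return [tuple(int(img[y, x] > t) for y, x, t in zip(*fern)) for fern in ferns]

def dissimilarity(code_a, code_b):
    """Block-wise Hamming distance: the number of code blocks that differ."""
    return sum(a != b for a, b in zip(code_a, code_b))

dark = encode(np.zeros((H, W)))
bright = encode(np.ones((H, W)))
```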
78. • Submap based approach
• Division of the scene into multiple rigid submaps
• Active submaps: tracked against at each frame
• Passive submaps: maintained, but not tracked against unless they become active again
Globally-Consistent Reconstruction
78
N. Fioraio, J. Taylor, A. Fitzgibbon, L. Di Stefano, and S. Izadi, "Large-scale and drift-free surface reconstruction using online subvolume registration," IEEE CVPR, pp. 4475–4483, 2015.
79. • Graph representation
• A numerical solution can be obtained with the popular Gauss-Newton method
Pose Graph Optimization
79
(Diagram: pose graph whose edges e_ij compare the measurement with the current estimate)
G. Grisetti, R. Kummerle, C. Stachniss, and W. Burgard, “A tutorial on graph-based SLAM,” IEEE Intell. Transp. Syst. Mag., vol. 2, no. 4, pp. 31–43, 2010.
82. • Incremental method for localization in 3D point clouds based on segment matching
• URL: https://github.com/ethz-asl/segmap
• Main components
• Front-end
• Sequential factors
• Place recognition factors
• Back-end
• Pose graph optimization
SegMap
82
83. • The front-end is responsible for
• Converting sensor measurements into sequential factors
• Segmentation and description for place recognition factors
• Sequential factors
• Odometry factors
• Displacement between consecutive robot poses from IMU data
• Scan-matching factors
• Registering the current scan against a local map by point-to-plane ICP
• Place recognition factors
• SegMatch: segment based loop-closure for 3D point clouds
Front-End
83
84. • Place recognition factors
• SegMatch: segment based loop-closure for 3D point clouds
• Four different modules
• Point cloud segmentation
• Descriptor extraction
• Segment matching
• Geometric verification
Front-End
84
85. • Point cloud segmentation
• Incremental region growing policy
• CanGrowTo
• If growing from a seed to a neighbor is allowed
• Check the angle between seed and candidate normals
• LinkClusters
• Link clusters if they have the same cluster id
• CanBeSeed
• If a point can be used as seed
• Check the curvature at a point
SegMatch
85
Region growing Cluster merging
86. • Descriptor extraction
• For compressing the raw segment data and building object signatures
• Diverse segment descriptors
• Eigenvalue based
• Eigenvalues for the segment are computed and combined in a 7 dim feature vector
• Linearity, planarity, scattering, omnivariance, anisotropy, eigenentropy, change of curvature
SegMatch
86
M. Weinmann, B. Jutzi, and C. Mallet, "Semantic 3D scene interpretation: A framework combining optimal neighborhood size selection with relevant features," ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci., vol. II-3, pp. 181–188, 2014.
Eigenvalue-based 3D features
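The seven features can be sketched directly from the covariance eigenvalues (a hedged illustration following Weinmann et al.'s sum-normalized eigenvalues; the tiny clipping constant is an implementation detail added here to avoid log(0)):

```python
import numpy as np

def eigenvalue_features(points):
    """7-dim eigenvalue-based descriptor: linearity, planarity, scattering,
    omnivariance, anisotropy, eigenentropy, and change of curvature, from the
    normalized covariance eigenvalues e1 >= e2 >= e3."""
    w = np.sort(np.linalg.eigvalsh(np.cov(np.asarray(points, float).T)))[::-1]
    w = np.clip(w, 1e-12, None)
    e1, e2, e3 = w / w.sum()
    return np.array([
        (e1 - e2) / e1,                                          # linearity
        (e2 - e3) / e1,                                          # planarity
        e3 / e1,                                                 # scattering
        (e1 * e2 * e3) ** (1.0 / 3.0),                           # omnivariance
        (e1 - e3) / e1,                                          # anisotropy
        -(e1 * np.log(e1) + e2 * np.log(e2) + e3 * np.log(e3)),  # eigenentropy
        e3,                                                      # change of curvature
    ])

# An elongated segment should score high on linearity and low on planarity
f = eigenvalue_features([(t, 0.0, 0.0) for t in range(20)])
```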
87. • Diverse segment descriptors
• Auto-encoder based
• Input: 3D binary voxel grid of fixed dimension 32x32x16
• Descriptor extractor part
• 3D convolutional layers with max pool layers and two Fully connected layers
• Rectified linear activation function for all layers
• Output: 64x1 descriptor
• Reconstruction part
• One fully connected layer and three deconvolutional layers with a final sigmoid output
• Output: reconstructed 3D binary voxel grid
• Loss
• Classification loss: softmax cross entropy loss
• Reconstruction loss: binary cross entropy loss
SegMatch
87
R. Dubé, A. Cramariuc, D. Dugas, J. Nieto, R. Siegwart, and C. Cadena, “SegMap: 3D Segment Mapping using Data-Driven Descriptors,” 2018.
88. • Geometric verification
• Consistency graph G = (V, E)
• Vertices V = {c_i}: the set of correspondences c_i
• Edges E = {e_ij}: undirected edges connecting all consistent pairs of correspondences (c_i, c_j)
• Geometrically consistent
• If the difference between the Euclidean distance of the segment centroids in the local map and in the target map is less than a threshold:
• |d_l(c_i, c_j) − d_t(c_i, c_j)| ≤ ε
• Identifying a maximum geometrically consistent set == finding a maximum clique of G
SegMatch
88
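A brute-force sketch of the consistency check and maximum-clique search (toy correspondences; SegMatch uses an efficient clique solver rather than enumerating subsets):

```python
from itertools import combinations
from math import dist

def consistent(ca, cb, eps=0.4):
    """c = (local segment centroid, target segment centroid); two correspondences
    are geometrically consistent if |d_l(ci, cj) - d_t(ci, cj)| <= eps."""
    (la, ta), (lb, tb) = ca, cb
    return abs(dist(la, lb) - dist(ta, tb)) <= eps

def max_consistent_set(corrs, eps=0.4):
    """Brute-force maximum clique of the consistency graph (fine for small sets)."""
    for r in range(len(corrs), 0, -1):
        for subset in combinations(corrs, r):
            if all(consistent(a, b, eps) for a, b in combinations(subset, 2)):
                return list(subset)
    return []

# Three correspondences related by a rigid shift, plus one outlier (toy data)
corrs = [((0, 0, 0), (5, 0, 0)),
         ((1, 0, 0), (6, 0, 0)),
         ((0, 2, 0), (5, 2, 0)),
         ((3, 3, 0), (30, 0, 0))]   # outlier: centroid distances do not match
inliers = max_consistent_set(corrs)
```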
90. Back-End
• Pose graph optimization by factor graphs (iSAM library)
• Factor graphs
• Graphical models that are well suited to modeling complex estimation problems
• Variables
• Unknown random variables in the estimation problem
• Robot poses
• Factors
• Probabilistic information on those variables, derived from measurements or prior knowledge
• Odometry, loop closure constraints
90