Deep Learning for Structure-from-Motion (SfM)
https://www.slideshare.net/PetteriTeikariPhD/deconstructing-sfmnet
Dataset creation for Deep Learning-based Geometric Computer Vision problems
https://www.slideshare.net/PetteriTeikariPhD/dataset-creation-for-deep-learningbased-geometric-computer-vision-problems
Emerging 3D Scanning Technologies for PropTech
https://www.slideshare.net/PetteriTeikariPhD/emerging-3d-scannng-technologies-for-proptech
Geometric Deep Learning
https://www.slideshare.net/PetteriTeikariPhD/geometric-deep-learning
Definitions: Image restoration concepts and 3D data structures
Data structures for real estate scans
RGB+D: pixel grid presenting color and depth, also called a 2.5D image (example from Prof. Li)
Mesh (polygon) from voxel data (“3D pixels”), e.g. voxel grid meshing using marching cubes (StackExchange); see the sketch below
Point cloud: unordered data, typically not on a grid but sparse points at non-integer coordinates
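As a minimal sketch of the voxel-to-mesh step mentioned above: marching cubes extracts a triangle mesh from a voxel grid. The example below uses scikit-image on a synthetic sphere volume; both the library choice and the toy volume are assumptions for illustration, since the slide only cites a StackExchange answer.

```python
# Sketch: voxel grid ("3D pixels") -> triangle mesh via marching cubes.
import numpy as np
from skimage import measure

# Synthetic 64^3 volume: squared distance from the centre of a unit cube.
grid = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]
volume = (grid ** 2).sum(axis=0)

# Extract the iso-surface at squared radius 0.25 (a sphere of radius 0.5).
verts, faces, normals, values = measure.marching_cubes(volume, level=0.25)
print(verts.shape, faces.shape)   # (N, 3) vertices and (M, 3) triangles
```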
Denoising: remove noise from a signal
A spatially cohesive superpixel model
for image noise level estimation
Peng Fu, Changyang Li, Weidong Cai, Quansen Sun
Neurocomputing
Volume 266, 29 November 2017, Pages 420-432
https://doi.org/10.1016/j.neucom.2017.05.057
Superpixels generated by
SCSM and SLIC from the
noisy “Fish” image at various
noise levels. (a)–(d) Original and
noisy images; noise SD values of
10, 20, and 30 from (b) to (d),
respectively. (e)–(h) Superpixels
generated by SCSM from the
corresponding images. (i)–(l)
Superpixels generated by SLIC
from the corresponding images.
This paper proposes an
automatic noise level
estimation method. In
contrast with the
conventional rectangular-
block division algorithms, the
images are decomposed into
superpixels that exhibit better
adherence to the local image
structures, thus generating a
division into small regions
that are more likely to be
homogeneous. Moreover, the effective use of the spatial neighborhood information makes the SCSM less sensitive to image noise.
https://sites.google.com/site/pierrickcoupe/softwares/denoising-for-medical-imaging/mri-denoising
http://dx.doi.org/10.1002/jmri.22003
https://youtu.be/5Y7yeRo5vGE
Help selecting noise
reduction plugin for
Photoshop CC 2014
https://www.dpreview.com/forums
/post/54065189
Imagenomic Noiseware,
Neat Image, Nik Software
DFine 2, Topaz DeNoise
5, NoiseNinja
Especially with classical image processing algorithms, it is beneficial to reduce noise before applying the actual processing / analysis.
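A minimal sketch of that pre-processing step, assuming scikit-image (the deck names no library): estimate the noise level automatically, in the spirit of the noise-level estimation work above, then denoise before the actual analysis.

```python
# Sketch: estimate the noise level, denoise, then run the actual analysis.
from skimage import data, img_as_float
from skimage.restoration import estimate_sigma, denoise_nl_means
from skimage.util import random_noise

image = img_as_float(data.camera())
noisy = random_noise(image, var=0.01)      # simulate sensor noise

sigma = estimate_sigma(noisy)              # automatic noise-level estimate
denoised = denoise_nl_means(noisy, h=1.15 * sigma, fast_mode=True,
                            patch_size=5, patch_distance=6)
# 'denoised' is what the downstream classical algorithm should consume.
```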
Deconvolution / Deblurring
Recent Progress in Image Deblurring
Ruxin Wang, Dacheng Tao
(Submitted on 24 Sep 2014)
https://arxiv.org/abs/1409.6838
http://blogs.adobe.com/photoshop/2011/10/behind-all-the-buzz-deblur-sneak-peek.html
Deconvolve the image with the Point Spread Function (PSF) that convolved the scene during image formation to sharpen the image
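A hedged sketch of this idea with Richardson–Lucy deconvolution from scikit-image, one classical deconvolution scheme among many; the uniform 5×5 PSF below is a toy assumption, since in practice the PSF is measured or estimated.

```python
# Sketch: deblur by deconvolving with the (here assumed known) PSF.
import numpy as np
from scipy.signal import convolve2d
from skimage import data, img_as_float
from skimage.restoration import richardson_lucy

image = img_as_float(data.camera())
psf = np.ones((5, 5)) / 25.0                    # toy uniform blur kernel
blurred = convolve2d(image, psf, mode='same')   # simulate image formation

# Iterative Richardson-Lucy deconvolution sharpens the blurred image.
restored = richardson_lucy(blurred, psf, num_iter=30)
```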
Edge-aware image smoothing
Smooth constant patches while retaining sharp edges, instead of a “dumb” Fourier low-pass filter that destroys the edges
Deep Edge-Aware Filters
http://lxu.me/projects/deepeaf/ | http://proceedings.mlr.press/v37/xub15.html
Filter examples: L0 smoothing, BLF (bilateral filter)
Our method is based on a
deep convolutional neural
network with a gradient
domain training
procedure, which gives
rise to a powerful tool to
approximate various
filters without knowing the
original models and
implementation details.
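For reference, the classical bilateral filter (BLF) that such networks learn to approximate is available directly in scikit-image; the parameter values below are illustrative assumptions.

```python
# Sketch: edge-aware smoothing with a bilateral filter. The range kernel
# (sigma_color) stops averaging across strong edges; the spatial kernel
# (sigma_spatial) sets the neighbourhood size.
from skimage import data, img_as_float
from skimage.restoration import denoise_bilateral

image = img_as_float(data.camera())
smoothed = denoise_bilateral(image, sigma_color=0.1, sigma_spatial=5)
```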
Efficient High-Dimensional, Edge-Aware Filtering | http://doi.org/10.1109/MCG.2016.119
Hui Huang, Shihao Wu, Minglun Gong, Daniel Cohen-Or, Uri Ascher,
and Hao Zhang, "Edge-Aware Point Set Resampling," ACM
Trans. on Graphics (presented at SIGGRAPH 2013), Volume 32,
Number 1, Article 9, 2013. [PDF | Project page with source code
https://doi.org/10.1145/2421636.2421645
The denoising capability of the blurring-sharpening strategy based
on the tooth volume (mesh). (a-d) are obtained by adding one
particular type of noise, as indicated by the corresponding
captions. SNR (in dB) of the noisy and the smoothed volumes are
shown in each figure.
Super-resolution
Depending on your background, super-resolution means slightly different things:
https://www.ucl.ac.uk/super-resolution:
Super-resolution imaging allows the imaging of fluorescently
labelled probes at a resolution of just tens of nanometers,
surpassing classic light microscopy by at least one order of
magnitude. Recent advances such as the development of photo-
switchable fluorophores, high-sensitivity microscopes and
single-molecule localisation algorithms make super-resolution
imaging rapidly accessible to the wider life sciences research
community.
At UCL we are currently taking a multidisciplinary effort to
provide researchers access to super-resolution imaging systems.
The Super-Resolution Facility (SuRF) currently features
commercial systems supporting the PALM/STORM, SIM and
STED super-resolution approaches.
Beyond diffraction-limited imaging: multiframe techniques, or ‘statistical upsampling’, e.g. deep learning
http://www.infrared.avio.co.jp/en/products/ir-thermo/lineup/r500/index.html
http://www.robots.ox.ac.uk/~vgg/research/SR/
https://techcrunch.com/2016/06/20/twitter-is-buying-magic-pony-te
chnology-which-uses-neural-networks-to-improve-images/
Deep Learning for Isotropic Super-Resolution
from Non-Isotropic 3D Electron Microscopy
Larissa Heinrich, John A. Bogovic, Stephan Saalfeld
HHMI Janelia Research Campus, Ashburn, USA
https://arxiv.org/abs/1706.03142
Geometrical super-resolution
Both features extend over 3 pixels but in
different amounts, enabling them to be
localized with precision superior to pixel
dimension
Multi-exposure image noise reduction
When an image is degraded by noise, there can be more detail in the average of
many exposures, even within the diffraction limit. See example on the right.
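The effect is easy to verify numerically: averaging N exposures with independent zero-mean noise reduces the noise standard deviation by roughly √N. A toy sketch with synthetic noise:

```python
# Sketch: multi-exposure noise reduction by frame averaging.
import numpy as np

rng = np.random.default_rng(0)
scene = np.linspace(0.0, 1.0, 256).reshape(16, 16)    # toy "true" image

frames = [scene + rng.normal(0.0, 0.1, scene.shape) for _ in range(64)]
mean_frame = np.mean(frames, axis=0)

print(np.std(frames[0] - scene))    # ~0.10  single-exposure noise
print(np.std(mean_frame - scene))   # ~0.0125, i.e. 0.1 / sqrt(64)
```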
Single-frame deblurring
Known defects in a given imaging situation, such as defocus or aberrations,
can sometimes be mitigated in whole or in part by suitable spatial-frequency
filtering of even a single image. Such procedures all stay within the diffraction-
mandated passband, and do not extend it.
Sub-pixel image localization
The location of a single source can be determined by computing the "center of
gravity" (centroid) of the light distribution extending over several adjacent
pixels (see figure on the left). Provided that there is enough light, this can be
achieved with arbitrary precision, very much better than pixel width of the
detecting apparatus and the resolution limit for the decision of whether the
source is single or double. This technique, which requires the presupposition
that all the light comes from a single source, is at the basis of what has
becomes known as superresolution microscopy, e.g. STORM, where
fluorescent probes attached to molecules give nanoscale distance
information. It is also the mechanism underlying visual hyperacuity.
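A hedged numpy sketch of the centroid computation: the intensity-weighted centre of gravity of a blob recovers the source position well below the pixel pitch (the Gaussian spot below is a toy stand-in for a fluorophore image).

```python
# Sketch: sub-pixel localization of a single source via its centroid.
import numpy as np

def centroid(patch):
    """Intensity-weighted centre of gravity in (row, col) pixel units."""
    rows, cols = np.indices(patch.shape)
    total = patch.sum()
    return (rows * patch).sum() / total, (cols * patch).sum() / total

# Toy Gaussian spot whose true centre lies between pixels, at (7.3, 8.6).
y, x = np.indices((16, 16))
spot = np.exp(-((y - 7.3) ** 2 + (x - 8.6) ** 2) / (2 * 1.5 ** 2))

print(centroid(spot))   # ~(7.3, 8.6): precision well below one pixel
```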
Bayesian induction beyond traditional diffraction limit
Some object features, though beyond the diffraction limit, may be known to be
associated with other object features that are within the limits and hence
contained in the image. Then conclusions can be drawn, using statistical
methods, from the available image data about the presence of the full object.
The classical example is Toraldo di Francia's proposition of judging whether an
image is that of a single or double star by determining whether its width
exceeds the spread from a single star. This can be achieved at separations well
below the classical resolution bounds, and requires the prior limitation to the
choice "single or double?"
The approach can take the form of extrapolating the image in the frequency
domain, by assuming that the object is an analytic function, and that we can
exactly know the function values in some interval. This method is severely
limited by the ever-present noise in digital imaging systems, but it can work for
radar, astronomy, microscopy or magnetic resonance imaging. More recently, a fast single image super-resolution algorithm based on a closed-form solution to ℓ2–ℓ2 problems has been proposed (Zhao et al. 2016) and demonstrated to accelerate most of the existing Bayesian super-resolution methods significantly. — WIKIPEDIA
Detail-revealing Deep Video Super-Resolution
Xin Tao, Hongyun Gao, Renjie Liao, Jue Wang, Jiaya Jia (Submitted
on 10 Apr 2017)
https://arxiv.org/abs/1704.02738
Recent deep-learning-based video SR methods [Caballero et al. 2016;
Kappeler et al. 2016] compensate inter-frame motion by aligning all
other frames to the reference one, using backward warping. We show
that such a seemingly reasonable technical choice is actually not
optimal for video SR, and improving motion compensation can directly
lead to higher quality SR results. In this paper, we achieve this by
proposing a sub-pixel motion compensation (SPMC) strategy, which is
validated by both theoretical analysis and extensive experiments.
Optical or diffractive super-resolution
WIKIPEDIA:
Substituting spatial-frequency bands. Though the bandwidth
allowable by diffraction is fixed, it can be positioned anywhere
in the spatial-frequency spectrum. Dark-field illumination in
microscopy is an example. See also aperture synthesis.
Multiplexing spatial-frequency bands, such as structured illumination. An image is formed using the normal passband of
the optical device. Then some known light structure, for
example a set of light fringes that is also within the passband, is
superimposed on the target. The image now contains
components resulting from the combination of the target and
the superimposed light structure, e.g. moiré fringes, and carries
information about target detail which simple, unstructured
illumination does not. The “superresolved” components,
however, need disentangling to be revealed.
Multiple parameter use within traditional diffraction limit
If a target has no special polarization or wavelength properties,
two polarization states or non-overlapping wavelength regions
can be used to encode target details, one in a spatial-
frequency band inside the cut-off limit, the other beyond it.
Both would utilize normal passband transmission but are then
separately decoded to reconstitute target structure with
extended resolution.
Probing near-field electromagnetic disturbance. The usual discussion of superresolution involves conventional imaging of an object by an optical system. But modern technology allows probing the electromagnetic disturbance within molecular distances of the source, which has superior resolution properties; see also evanescent waves and the development of the superlens.
Optical negative-index metamaterials
Nature Photonics 1, 41 - 48 (2007)
doi: 10.1038/nphoton.2006.49 | Cited by 2372
Sub–Diffraction-Limited Optical Imaging
with a Silver Superlens
Science 22 Apr 2005: Vol. 308, Issue 5721, pp. 534-537
doi: 10.1126/science.1108759 | Cited by 3219 articles
Optical and acoustic metamaterials: superlens,
negative refractive index and invisibility cloak
Journal of Optics, Volume 19, Number 8
http://dx.doi.org/10.1088/2040-8986/aa7a1f
→ Special issue on the history of metamaterials
http://zeiss-campus.magnet.fsu.edu/articles/superresolution/supersim.html
Inpainting
Paint over artifacts / missing values using surrounding pixels (the “Clone Tool” in Photoshop), more statistically using the same image (“Content-Aware Fill”), or using larger databases, for example in deep learning pipelines.
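A minimal sketch of the 'use surrounding pixels' variant, assuming scikit-image's biharmonic inpainting (the deck names Photoshop tools, not a library):

```python
# Sketch: fill a masked region from its surroundings -- the statistical
# cousin of Photoshop's Clone Tool / Content-Aware Fill.
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import inpaint_biharmonic

image = img_as_float(data.camera())
mask = np.zeros(image.shape, dtype=bool)
mask[100:140, 200:260] = True          # artifact region to paint over

# Biharmonic inpainting propagates surrounding intensities into the hole.
restored = inpaint_biharmonic(image, mask)
```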
The TUM-Image Inpainting Database
Technische Universität München
https://www.mmk.ei.tum.de/tumiid/
Context Encoders: Feature Learning by Inpainting (2016)
Deepak Pathak, Phillip Krähenbühl, Jeff Donahue, Trevor Darrell, Alexei A. Efros
http://people.eecs.berkeley.edu/~pathak/context_encoder/
Improve your skin with Inpaint
https://www.theinpaint.com/
Guillemot and Le Meur (2014)
http://dx.doi.org/10.1109/MSP.2013.2273004
Yang et al. (2017) https://arxiv.org/abs/1611.09969
2D Image Restoration
Multiframe 2D super-resolution #1
A Unified Bayesian Approach to Multi-Frame
Super-Resolution and Single-Image
Upsampling in Multi-Sensor Imaging
Thomas Köhler, Johannes Jordan, Andreas Maier and Joachim Hornegger
Proceedings of the British Machine Vision Conference (BMVC), pages 143.1-143.12. BMVA Press,
September 2015.
https://dx.doi.org/10.5244/C.29.143
Robust Multiframe Super-Resolution
Employing Iteratively Re-Weighted
Minimization
Thomas Köhler ; Xiaolin Huang ; Frank Schebesch ; André Aichert ; Andreas Maier ;
Joachim Hornegger
IEEE Transactions on Computational Imaging ( Volume: 2, Issue: 1, March 2016 )
https://doi.org/10.1109/TCI.2016.2516909
Future work should consider an adaptation of our prior to blind super-resolution, where the camera PSF is unknown, or to other image restoration problems, e.g. image deconvolution.
In this work, we limited ourselves to non-blind super-resolution, where the PSF is assumed to be known. However, iteratively re-weighted minimization could be augmented by blur estimation. Another promising extension is joint motion estimation and super-resolution, e.g. by using the nonlinear least squares algorithm. Conversely, blur and motion estimation can also benefit from being used in combination with our spatially adaptive model. One further direction of our future work is to make our approach adaptive to the scene content, e.g. by a local selection of the sparsity parameter p.
Classical image acquisition techniques
Point cloud acquisition
Guide to quickly build high-quality three-dimensional
models with a structured light range scanner
Bao-Quan Shi and Jin Liang
OSA Applied Optics Vol. 55, Issue 36, pp. 10158-10169 (2016)
https://doi.org/10.1364/AO.55.010158 [PDF] researchgate.net
Multiframe techniques or multisweep techniques #1
High Fidelity Scan Merging
Computer Graphics Forum July 2010
http://doi.org/10.1111/j.1467-8659.2010.01773.x
For each scanned object, 3D triangulation laser scanners deliver multiple sweeps corresponding to multiple laser motions and orientations.
Scan integration as a labelling problem
Pattern Recognition Volume 47, Issue 8, August 2014, Pages 2768-2782
https://doi.org/10.1016/j.patcog.2014.02.008
Example of overlapping scans.
This head is such a complex
structure that not less than 35
scans were acquired to fill in
most holes.
Example of two overlapping scans: points of each scan are first meshed separately ((c)–(d)). The result can be compared to the meshing of points of both scans together (d).
Comparison of registration
of two scans (colored in
different colors on the top
figure) using Global Non
Rigid Alignment (middle)
and scale space merging
(bottom).
Comparisons of the merging (a) with a
level set (Poisson Reconstruction)
reconstruction method of the
unmerged scans point set (b) and a
filtering of the unmerged scans point
set (c). The level set method
obviously introduces a serious
smoothing, yet does not eliminate the
scanning boundary lines. The bilateral
filter, applied until all aliasing artifacts
have been eliminated, over-smoothes
some parts of the shape.
Multiframe techniques or multisweep techniques #2
Density adaptive trilateral scan integration
method
Bao-Quan Shi and Jin Liang
Applied Optics Vol. 54, Issue 19, pp. 5998-6009 (2015)
https://doi.org/10.1364/AO.54.005998
Multi-Focus Image Fusion Via Coupled
Sparse Representation and Dictionary
Learning
Rui Gao, Sergiy A. Vorobyov (Submitted on 30 May 2017)
Aalto University, Dept. Signal Processing and Acoustics
https://arxiv.org/abs/1705.10574
Standard 3D modeling pipeline of the commercial scanner XJTUOM
Integration of 26 partially overlapping scans
of a dice model. (a) SDF method. (b)
Screened Poisson method. (c) Advancing
front triangulation method. (d) K-means
clustering method. (e) The new method.
The new method is more robust to large gaps/registration errors than previous methods.
Owing to the noise-removal property of the trilateral shifting procedure and mean-shift
clustering algorithm, the new method produces much smoother surfaces.
Multiframe techniques or multisweep techniques #3
Crossmodal point cloud registration in the
Hough space for mobile laser scanning data
Bence Gálai ; Balázs Nagy ; Csaba Benedek
Pattern Recognition (ICPR), 2016
https://doi.org/10.1109/ICPR.2016.7900155
Top row: Point clouds of three different vehicle mounted Lidar systems (Velodyne HDL64 and VLP16 I3D
scanners, and a Riegl VMX450 MMS), captured from the same scene at Fővám Tér, Budapest. Bottom row:
segmentation results for each cloud by our proposed method
Multiframe techniques or multisweep techniques #4
Frame Rate Fusion and Upsampling of
EO/LIDAR Data for Multiple Platforms
T. Nathan Mundhenk ; Kyungnam Kim ; Yuri Owechko
Computer Vision and Pattern Recognition Workshops (CVPRW), 2014 IEEE
https://doi.org/10.1109/CVPRW.2014.117
The left pane shows the PanDAR demonstrator sensors with the
red Ladybug sensor mounted over the silver Velodyne 64E
LIDAR. A custom aluminum scaffold connects the two sensors.
The right pane shows the graphical interface with displays of the
3D model in the top, help menus and the depth map at the
bottom.
Multithreaded programming and GP-GPU methods allow us to obtain 10 fps with a Velodyne 64E LIDAR completely fused in 360° using a Ladybug panoramic camera.
PanDAR: a wide-area, frame-rate, and full
color lidar with foveated region using
backfilling interpolation upsampling
T. Nathan Mundhenk; Kyungnam Kim; Yuri Owechko
Proceedings Volume 9406, Intelligent Robots and Computer Vision XXXII: Algorithms and
Techniques; 94060K (2015)
Event: SPIE/IS&T Electronic Imaging, 2015, San Francisco, California, United States
http://dx.doi.org/10.1117/12.2078348
Multiframe techniques or multisweep techniques #5
Upsampling method for sparse light
detection and ranging using coregistered
panoramic images
Ruisheng Wang; Frank P. Ferrie
J. of Applied Remote Sensing, 9(1), 095075 (2015)
http://dx.doi.org/10.1117/1.JRS.9.095075
See-through problem and invalid light detection and ranging (LiDAR,
Velodyne HDL-64E) points returned from building interior. (a) Camera
image rendered from a certain viewpoint, (b) corresponding LiDAR
image rendered from the same viewpoint of (a), (c) corresponding
LiDAR image rendered from a top-down viewpoint
“There are a number of improvements that are possible and are topics for future work. The initial depth ordering
that used to determine visibility assumes a piecewise planar partition of the scene. While this can suffice for the
urban environment considered here, a more general approach would consider a richer form of representation, e.g.,
using statistical modeling methods. Cues that are available in the coregistered intensity data, such as the loci of
occluding contours, could also be exploited. At present, our interpolation strategy samples image space to determine
connectivity and backprojects to 3-D, resulting in a nonuniform interpolation. A better solution would be to perform
the sampling in 3-D by backprojecting the 2-D boundary and forming a 3-D bounding box that could then be
interpolated at the desired resolution. In the limit, true multimodal analysis would consider the joint distribution of
both intensity and depth information with the aim of inferring more detailed interpolation functions. With the
availability of sophisticated platforms such as Navteq True, there is clearly an incentive to move in these directions.”
Classical image restoration for laser scans
Laser Scanner Super-resolution
https://doi.org/10.2312/SPBG/SPBG06/009-015
Cited by 47 articles
Point cloud processing Reviews
Point Cloud Processing
Raphaële Héno and Laure Chandelier
in 3D Modeling of Buildings. Chapter 5. (2014)
http://doi.org/10.1002/9781118648889.ch5
A review of algorithms for filtering the 3D point cloud
Signal Processing: Image Communication
Volume 57, September 2017, Pages 103-112
https://doi.org/10.1016/j.image.2017.05.009
Octree structuring: point cloud and different
levels of the hierarchical grid
Example of significant noise on
the profile view of a target: the
standard deviation for a point
cloud at target level is 8 mm
A brief discussion of future research directions is presented as follows.
1) Combination of color and geometric information: For point clouds, especially those containing color information, a purely color-based or purely geometry-based method cannot work well. Hence, it is expected that combining the color and geometric information in the filtering process will further increase the performance of a filtering scheme.
2) Time complexity reduction: Because point clouds contain a large number of points, sometimes hundreds of thousands or even millions, computation on them is time consuming. It is necessary to develop filtering technologies that filter point clouds efficiently to reduce time complexity.
3) Filtering of point cloud sequences: Since object recognition from point cloud sequences is likely to become a future research direction, filtering the point cloud sequence will help to improve the performance and accuracy of object recognition.
Point cloud processing SOFTWARE
PCL Point Cloud Library (C++)
http://pointclouds.org/
https://github.com/PointCloudLibrary
MeshLab with beginner-friendly graphical front-end
http://www.meshlab.net/
https://github.com/cnr-isti-vclab/meshlab
CGAL Computational Geometry
Algorithms Library (C++)
https://www.cgal.org/
https://github.com/CGAL/cgal
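As a taste of what such toolkits provide, here is a hedged Python sketch of a typical load → clean → downsample → normals pipeline using Open3D (not listed above, but a common scriptable alternative to the C++ libraries; the file names are hypothetical):

```python
# Sketch of a basic point cloud pipeline; Open3D is an assumption, chosen
# as a scriptable stand-in for the C++ libraries listed above.
import open3d as o3d

pcd = o3d.io.read_point_cloud("scan.ply")      # hypothetical input scan

# Statistical outlier removal: drop points far from their neighbours.
pcd, kept = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

# Voxel-grid downsampling, then normal estimation for later meshing.
pcd = pcd.voxel_down_sample(voxel_size=0.01)
pcd.estimate_normals(
    o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

o3d.io.write_point_cloud("scan_clean.ply", pcd)
```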
Point cloud denoising #1
Similarity based filtering of point clouds
Julie Digne
Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE
https://doi.org/10.1109/CVPRW.2012.6238917
Photogrammetric DSM denoising
Nex, F; Gerke, M. The International Archives of Photogrammetry, Remote Sensing and Spatial
Information Sciences; Gottingen XL.3: 231-238. Gottingen: Copernicus GmbH. (2014)
http://dx.doi.org/10.5194/isprsarchives-XL-3-231-2014
Differences between ground
truth and noisy DSM
Photogrammetric Digital Surface Models (DSM) are usually affected by both random
noise and gross errors. These errors are generally concentrated in correspondence
of occluded or shadowed areas and are strongly influenced by the texture of the
object that is considered, or the number of images employed for the matching.
In the future, further tests will be performed on other real DSM in order to assess the
reliability of the developed method in very different operative conditions. Then, the
extension from the 2.5D case to the fully 3D will be performed and further
comparisons with other available denoising algorithms will be performed, as well.
In addition, a key feature of our method is that it is independent of a surface mesh: it can work
directly on point clouds, which is useful, since building a mesh of a noisy point cloud is never
easy, whereas building a mesh of a properly denoised shape is well understood. A possible
extension for this work would be to use the filter as a projector onto the surface, in a spirit
similar to [Lipman et al. 2007] for example.
Point cloud denoising #2
Point Cloud Denoising via Moving RPCA
E. Mattei, A. Castrodad
Computer Graphics Forum (2016). doi: 10.1111/cgf.13068
Guided point cloud denoising via sharp feature
skeletons
The Visual Computer June 2017, Volume 33, Issue 6–8, pp 857–867
Yinglong Zheng, Guiqing Li, Shihao Wu, Yuxin Liu, Yuefang Gao
https://doi.org/10.1007/s00371-017-1391-8
Denoising synthetic datasets of two planes meeting at increasingly shallow angles (20.4K points), with added Gaussian noise of standard deviation equal to 1% of the length of the bounding box diagonal. The two planes meet at angles of 140°, 150° and 160°. The first and second rows show the noisy 3D data and 2D transects, respectively. Rows 3–5 show the results of the bilateral filter, AWLOP and MRPCA.
Denoising of the Vienna cathedral SfM model.
The noisy input was processed with MRPCA
followed by a simple outlier removal method
using Meshlab.
Although MRPCA is robust against outliers, this robustness is achieved only locally. One simple modification to achieve global outlier robustness is to use an ℓ1 data-fitting term in problem (P6). Although the ℓ1-norm will handle global outliers better than the Frobenius norm used in this work, the computational cost will increase significantly. We point out that the size of the neighbourhoods is set globally. One improvement over the current method could be to make the neighbourhood size a function of the local point density. This could have a positive effect when handling datasets with spatially varying noise.
Point cloud Edge Detection #1
Fast and Robust Edge Extraction in Unorganized
Point Clouds
Dena Bazazian ; Josep R. Casas ; Javier Ruiz-Hidalgo
Digital Image Computing: Techniques and Applications (DICTA), 2015
https://doi.org/10.1109/DICTA.2015.7371262
Point cloud Edge Detection #2
Segmentation-based Multi-Scale Edge Extraction
to Measure the Persistence of Features in
Unorganized Point Clouds
Dena Bazazian ; Josep R. Casas ; Javier Ruiz-Hidalgo
"12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications". Porto:
2017, p. 317-325.
http://dx.doi.org/10.5220/0006092503170325
Estimating the neighbors of a sample
point on the ear of bunny at large scales:
(a) far away neighbors may belong to
foreign surfaces when Euclidean
distance is used; (b) geodesic distance
is a better choice to explore large local
neighborhoods; (c) the point cloud can
be segmented to distinguish different
surfaces.
Point cloud super-resolution #1
LidarBoost: Depth superresolution for ToF 3D
shape scanning
Sebastian Schuon ; Christian Theobalt ; James Davis ; Sebastian Thrun
Computer Vision and Pattern Recognition, 2009. CVPR 2009
https://doi.org/10.1109/CVPR.2009.5206804
A new upsampling method for mobile LiDAR
data
Ruisheng Wang ; Jeff Bach ; Jane Macfarlane ; Frank P. Ferrie
Applications of Computer Vision (WACV), 2012 IEEE
https://doi.org/10.1109/WACV.2012.6162998
Real scene - wedges and panels (a): This scene with many depth edges (b) demonstrates the true resolution gain.
Image-based super-resolution (IBSR) (c) demonstrates increased resolution at the edges, but some aliasing
remains and the strong pattern in the interior persists. LidarBoost (d) reconstructs the edges much more clearly
and there is hardly a trace of aliasing, also the depth layers visible in the red encircled area are better captured.
Markov-Random-Field (MRF) upsampling (e) oversmooths the depth edges and in some places allows the low
resolution aliasing to persist.
The main contributions of this paper are:
● A 3D depth sensor superresolution method that incorporates ToF-specific knowledge and data. Additionally, a new 3D shape prior is proposed that enforces 3D-specific properties.
● A comprehensive evaluation of the working range and accuracy of our algorithm using synthetic and real data captured with a ToF camera.
● Only few depth superresolution approaches have been developed previously. We show that our algorithm clearly outperforms the most related approaches.
Point cloud super-resolution #2
Geometry Super-Resolution by Example
Thales Vieira ; Alex Bordignon ; Thomas Lewiner ; Luiz Velho
Computer Graphics and Image Processing (SIBGRAPI), 2009
https://doi.org/10.1109/SIBGRAPI.2009.10
Laser stripe model for sub-pixel peak
detection in real-time 3D scanning
Ingmar Besic ; Zikrija Avdagic
Systems, Man, and Cybernetics (SMC), 2016 IEEE
https://doi.org/10.1109/SMC.2016.7844912
Our tests show that noise does not vary significantly when observed on different color channels. Thus estimator algorithms can utilize any of the color channels without sacrificing precision due to significantly increased noise. However, if a clever choice is to be made, the estimator should opt for the green channel, as it provides the most reliable stripe intensity data for both black and white surfaces across the whole modulation ROI. We have found that the noise does not have a continuous uniform PDF but a normal (Gaussian) PDF, and proposed a model that fits the empirical data. Our measurements support the assumption that the laser stripe image has an approximately Gaussian intensity profile. However, RMSE values show that a single Gaussian curve fit is not the best choice, as the stripe intensity profile is superposed with surface reflections. After testing Gaussian fits with 1 to 8 curves, we have concluded that models with n > 2 are not suitable, as they produce false peaks or subtract light intensity to achieve better RMSE. Thus we proposed a Gaussian fit with two curves as the optimal model based on the empirical data.
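A hedged sketch of that two-curve Gaussian fit with scipy's curve_fit on a synthetic stripe profile (all numbers are illustrative, not the paper's data):

```python
# Sketch: fit a laser stripe intensity profile with a sum of two Gaussians;
# the dominant mean b1 gives the sub-pixel peak position.
import numpy as np
from scipy.optimize import curve_fit

def two_gaussians(x, a1, b1, c1, a2, b2, c2):
    return (a1 * np.exp(-((x - b1) ** 2) / (2 * c1 ** 2)) +
            a2 * np.exp(-((x - b2) ** 2) / (2 * c2 ** 2)))

x = np.arange(40.0)
# Synthetic profile: main stripe at 19.4 px plus a weak reflection lobe.
profile = two_gaussians(x, 1.0, 19.4, 2.0, 0.2, 23.0, 4.0)
profile += np.random.default_rng(1).normal(0.0, 0.01, x.size)

p0 = [1.0, 20.0, 2.0, 0.1, 25.0, 4.0]            # rough initial guess
params, _ = curve_fit(two_gaussians, x, profile, p0=p0)
print(params[1])   # fitted b1: sub-pixel stripe peak, ~19.4
```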
Our future work is to target relations between coefficients of the proposed laser stripe intensity profile and reduce their number if possible. Preliminary tests show that the mean values b1 and b2 of the Gaussian curves tend to be equal or to differ by a subpixel amount. It is not yet clear whether this difference has any significance or can be neglected. We also intend to test our model with different angles between the laser source and the target surface.
The proposed method is limited to models which have repeated occurrences of a shape,
and restrict the resolution increase to the regions of those occurrences. Increasing the
resolution of other parts would require inpainting-like tools to extrapolate the geometry
[Sharf et al. 2004], together with a superresolution scheme as an extension of the one
proposed here.
Point cloud super-resolution #3
Super-Resolution of Point Set Surfaces Using
Local Similarities
Azzouz Hamdi-Cherif, Julie Digne, Raphaëlle Chaine
Computer Graphics Forum 2017
http://dx.doi.org/10.1111/cgf.13216
Super-resolution of a single scan of the Maya point set. Left: initial scan, right: super-resolution. For
visualization purposes, both are reconstructed with Poisson reconstruction
Super-resolution of an input shape with highly repetitive geometric texture. (a) Underlying shape to be
sampled by an acquisition device. (b) Low-resolved input sampling of the shape and local approximation with a
quadric at each point; geometric texture is the residue over the quadric. (c) Super-resolved re-sampling using our
method (fusion of super-resolved local patches). Right column: Generation of the super-resolved patches. (d)
Construction of a local descriptor of the residue over a low-resolution grid corresponding to the unfolded quadric;
blue points represent the height values estimated at bin centres, red points are the input points. (e) Similar
descriptor points are added (orange points) to the input points (in red) of the local descriptor. (f) A super-resolved
descriptor is computed from the set of red and orange points
Super-resolution of a single scan of the Persepolis point set. Left: initial scan, right: super-resolution. The shape details appear much sharper after the super-resolution process. Parameters: r = 4 (shape diagonal: 114), nbins_lr = 64, nbins_sr = 400 and r_sim = 0.2.
Point cloud Classification #1
Point cloud Classification #2
Multi-class US traffic signs 3D recognition and
localization via image-based point cloud model
using color candidate extraction and texture-
based recognition
Vahid Balali, Arash Jahangiri and Sahar Ghanipoor Machiani
Advanced Engineering Informatics Volume 32, April 2017, Pages 263-274
https://doi.org/10.1016/j.aei.2017.03.006
An improved Structure-from-Motion (SfM) procedure is developed to create a clean 3D point cloud from street-level imagery and assist accurate 3D localization via color and texture feature extraction. The detected traffic signs are triangulated using camera pose information and their corresponding locations are visualized in a 3D environment. The proposed method, as shown in Fig. 1, mainly consists of three key components:
1) Detecting and classifying traffic signs using 2D images;
2) Reconstructing and automatically cleaning a 3D point cloud model; and
3) Recognizing and localizing traffic signs in a 3D environment.
Point cloud Clustering for simplification #1
Adaptive simplification of point cloud using k-
means clustering
Bao-Quan Shi, Jin Liang, Qing Liu
Computer-Aided Design Volume 43, Issue 8, August 2011, Pages 910-922
https://doi.org/10.1016/j.cad.2011.04.001
A parallel point cloud clustering algorithm for
subset segmentation and outlier detection
Christian Teutsch, Erik Trostmann, Dirk Berndt
Proceedings Volume 8085, Videometrics, Range Imaging, and Applications XI; 808509 (2011)
http://dx.doi.org/10.1117/12.888654
Cluster initialization of the Stanford bunny. Left: input
data. Middle: initialization of the cluster centroids. Right:
initial clusters are formed, and one cluster is shown in one
color.
If the noise of the 3D point cloud is
serious, effective noise filtering should
be conducted before the simplification.
The proposed method can also simplify
multiple 3D point clouds. Our future
research will concentrate on simplifying
multiple 3D point sets simultaneously
For example, a point set with two million coordinates is analyzed within three seconds, and 15 million points within 35 seconds, on an Intel Core2 processor. It handles arbitrary n-dimensional data formats, e.g. with additional color and/or normal vector information, since it is implemented as a template class. The algorithm is easy to parallelize, which further increases the computation performance on multi-core machines for most applications. The feasibility of our clustering technique has been evaluated on a variety of point clouds from different measuring applications and 3D scanning devices.
Point cloud Object detection #1
Object Detection in Point Clouds Using Conformal
Geometric Algebra
Aksel Sveier, Adam Leon Kleppe, Lars Tingelstad and Olav Egeland
Advances in Applied Clifford Algebras 2017
http://dx.doi.org/10.1007/s00006-017-0759-1
In this paper we focus on the detection of primitive geometric
models in point clouds using RANSAC. A central step in the
RANSAC algorithm is to classify inliers and outliers. We show that conformal geometric algebra (CGA) enables filters with a geometrical interpretation for inlier/outlier classification. The last step of the RANSAC algorithm is fitting the primitive to its inliers. This can be performed analytically with CGA, and the method is identical for both planes and spheres.
Setup of the robotic pick-and-place demonstration. Point clouds from the 3D camera are used for detecting the plane, spheres and cylinder. The information is sent to the robot arm, which is used to place the spheres in the cylinder.
Spheres were successfully detected in point clouds with up to 90%
outliers and cylinders could successfully be detected in point
clouds with up to 80% outliers. We suggested two methods for
constructing a cylinder from point data using CGA and found that
fitting two spheres to a cylinder gave performance advantages
compared to constructing a circle and line from 3 points on the
cylinder surface.
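For orientation, a plain-numpy RANSAC sketch for sphere detection. The paper's conformal geometric algebra formulation is not reproduced here; this is the conventional algebraic equivalent of the same sample → classify inliers → fit loop.

```python
# Sketch: RANSAC sphere detection in a point cloud (N x 3 numpy array).
import numpy as np

def fit_sphere(pts):
    """Algebraic least-squares sphere through >= 4 points -> (centre, radius)."""
    A = np.c_[2.0 * pts, np.ones(len(pts))]
    b = (pts ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    centre, k = sol[:3], sol[3]
    r2 = k + centre @ centre
    return centre, (np.sqrt(r2) if r2 > 0 else 0.0)

def ransac_sphere(pts, n_iters=500, tol=0.01, seed=0):
    """Return the sphere supported by the most inliers."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iters):
        sample = pts[rng.choice(len(pts), 4, replace=False)]
        centre, radius = fit_sphere(sample)
        # Inlier test: distance to the sphere surface within tolerance.
        inliers = np.abs(np.linalg.norm(pts - centre, axis=1) - radius) < tol
        if inliers.sum() > best.sum():
            best = inliers
    return fit_sphere(pts[best]), best   # final fit on all inliers
```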
Point cloud Compression static #1
Research on the Self-Similarity of Point Cloud
Outline for Accurate Compression
Xuandong An; Xiaoqing Yu; Yifan Zhang
2015 International Conference on Smart and Sustainable City and Big Data (ICSSC)
http://dx.doi.org/10.1049/cp.2015.0272
The Lovers of Bordeaux (15.8 million
points). Exploiting self-similarity in the
model, we compress this representation
down to 1.15 MB. The resulting model
(right) is very close to the original one
(left), as the reconstruction error is less
than the laser scanner precision
(0.02mm) for 99.14% of the input points.
Point cloud compression approaches have mostly dealt with coordinate quantization via recursive space partitioning [Gandoin and Devillers 2002; Schnabel and Klein 2006; Huang et al. 2006; Smith et al. 2012]. In a nutshell, these approaches consist of inserting the points into a space partitioning data structure (e.g. octree, kd-tree) of given depth, and replacing them by the centers of the cells they belong to (see the sketch below).
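A hedged numpy sketch of that fixed-depth quantization (not the cited implementations): snap every point to the centre of its occupied grid cell.

```python
# Sketch: coordinate quantization -- replace points by the centres of their
# occupied cells, the fixed-depth analogue of octree-based compression.
import numpy as np

def quantize(points, cell=0.05):
    """Replace points by the centres of occupied cells of size `cell`."""
    idx = np.floor(points / cell).astype(np.int64)   # cell index per point
    occupied = np.unique(idx, axis=0)                # one entry per cell
    return (occupied + 0.5) * cell                   # cell centres

pts = np.random.default_rng(0).random((100_000, 3))
compressed = quantize(pts, cell=0.05)
print(len(pts), '->', len(compressed), 'points')     # lossy size reduction
```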
Self-similarity of measured signals has gained interest over the past decade: research on signal processing as well as image processing has made outstanding progress by taking advantage of the self-similarity of the measurements. In the image processing field, the idea originated in the non-local means algorithm [Buades et al. 2005]: instead of denoising a pixel using its neighboring pixels, it is denoised by exploiting pixels of the whole image that look similar. The similarity between pixels is computed by comparing patches around them. Behind this powerful tool lies the idea that pixels far away from the considered area might carry information that helps processing it, because of the natural self-similarity of the image.
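The mechanism is compact enough to sketch directly; the toy, quadratic-time version below denoises one pixel by a patch-similarity-weighted average over the whole image (production implementations restrict the search window and vectorize).

```python
# Sketch: non-local means in miniature -- denoise a pixel by a similarity-
# weighted average over patches drawn from the *whole* image.
import numpy as np

def nlm_pixel(img, r, c, patch=3, h=0.1):
    """Denoised value at (r, c) using every same-size patch in the image."""
    p = patch // 2
    ref = img[r - p:r + p + 1, c - p:c + p + 1]
    num = den = 0.0
    for i in range(p, img.shape[0] - p):
        for j in range(p, img.shape[1] - p):
            cand = img[i - p:i + p + 1, j - p:j + p + 1]
            w = np.exp(-((ref - cand) ** 2).sum() / h ** 2)  # patch similarity
            num += w * img[i, j]
            den += w
    return num / den
```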
Self-similarity of surfaces has mainly been exploited for surface denoising applications: the non-local means filter has been adapted for surfaces, be they meshes [Yoshizawa et al. 2006] or point clouds [Adams et al. 2009; Digne 2012]. It was also used to define a Point Set Surface variant [Guillemot et al. 2012] exhibiting better robustness to noise. Self-similarity of surfaces is obviously not limited to denoising purposes. For example, analyzing the similarity of a surface can lead to detecting symmetries or repetition structures in surfaces [Mitra et al. 2006; Pauly et al. 2008]. An excellent survey of methods exploiting symmetry in shapes can be found in [Mitra et al. 2013].
There are several ways in which our compression scheme could be improved:
● Exploiting a patch-based representation, artifacts may appear at boundaries, which could be dilated throughout decompression. One could mitigate this issue by adjusting the patch size (clipping some outer grid cells) along boundaries. This would require storing one small integer for each patch, at a small cost.
● Other seed-picking strategies could be implemented, for example by placing the seeds so that they minimize the local error, in the spirit of [Ohtake et al. 2006].
● Encoding per-point attributes such as normals and colors is possible with the same similarity-based coder.
Perspectives: Although our algorithm is based on the exploitation of self-similarity on the whole surface,
most of the involved treatments remain local. This is a good prospect for handling data of ever increasing
size, using streaming processes. This is particularly important at a time when the geometric digitization
campaigns sometimes cover entire cities.
Point cloud Compression Dynamic #1 voxelized
Motion-Compensated Compression of Dynamic
Voxelized Point Clouds
Ricardo L. de Queiroz ; Philip A. Chou
IEEE Transactions on Image Processing ( Volume: 26, Issue: 8, Aug. 2017 )
https://doi.org/10.1109/TIP.2017.2707807
As a new concept for a new application, much still has to be fine-tuned and perfected.
For example, the post-processing (in-loop or otherwise) is far from reaching its peak
performance. Both the morphological and the filtering operations are not well
understood in this context. Similarly, the distortion metrics or the voxel matching
methods are not developed to a satisfactory point. There is still plenty of work to be
done to extend the present framework to use B-frames (bidirectional prediction) and
to extend the GOF to a more typical IBBPBBP... format. Furthermore, we want to use
adaptive block sizes, which are optimally selected in an RD sense and we also want to
encode both the geometry and the color residues for the predicted (P and B) blocks.
Finally, rather than re-using the correspondences from the surface reconstruction
among consecutive frames, we want to develop efficient motion estimation
methods for use with our coder. Each of these enhancements should improve the
coder performance, such that there is a continuous sequence of improvements in this
new frontier to be explored.
Point cloud Compression Dynamic #2
Graph-Based Compression of Dynamic 3D Point
Cloud Sequences
Dorina Thanou ; Philip A. Chou ; Pascal Frossard
IEEE Transactions on Image Processing ( Volume: 25, Issue: 4, April 2016 )
https://doi.org/10.1109/TIP.2016.2529506
Example of a point cloud of the ‘yellow dress’
sequence (a). The geometry is captured by a
graph (b) and the r component of the color is
considered as a signal on the graph (c). The size
and the color of each disc indicate the value of the
signal at the corresponding vertex.
Octree decomposition of a 3D model for two
different depth levels. The points belonging to each
voxel are represented by the same color.
There are a few directions that can be explored in the future. First, it has been shown in our experimental section that a significant part of the bit budget is spent on the compression of the 3D geometry, which, given a particular depth of the octree, is lossless. A lossy compression scheme that permits some errors in the reconstruction of the geometry could bring non-negligible benefits in terms of the overall rate-distortion performance. Second, the optimal bit allocation between geometry, color and motion vector data remains an interesting and open research problem, due mainly to the lack of a suitable metric that balances geometric and color visual quality. Third, the estimation of the motion is done by computing features based on the spectral graph wavelet transform. Features based on data-driven dictionaries, such as the ones proposed in [Thanou et al. 2014], are expected to significantly improve the matching, and consequently the compression performance.
Dynamic meshes: Laplace operator
A 3D+t Laplace operator for temporal mesh
sequences
Victoria Fernández Abrevaya , Sandeep Manandhar, Franck Hétroy-Wheeler, Stefanie Wuhrer
Computers & Graphics Volume 58, August 2016, Pages 12-22
https://doi.org/10.1016/j.cag.2016.05.018
In this paper we have introduced a discrete Laplace operator for temporally coherent mesh sequences. This operator is defined by modelling the sequences as CW complexes in a 4-dimensional Riemannian space and using Discrete Exterior Calculus. A user-defined parameter α is associated with the 4D space to control the influence of motion with respect to the geometry. We have shown that this operator can be expressed by a sparse blockwise tridiagonal matrix, with a linear number of non-zero coefficients with respect to the number of vertices in the sequence. The storage overhead with respect to frame-by-frame mesh processing is limited. We have also shown an application example, as-rigid-as-possible editing, for which it is relatively easy to extend the classical static Laplacian framework to mesh sequences with this matrix. Similar results to state-of-the-art methods can be reached with a simple, global formulation.
This opens the possibility of many other problems in animation processing to be
tackled the same way by taking advantage of the existing literature on the Laplacian
operator for 3D meshes [Zhang et al. 2010]. In the future, we are in particular interested in
studying the spectral properties of the defined discrete Laplace operator.
Point cloud inpainting #1
Region of interest (ROI) based 3D inpainting
Shankar Setty, Himanshu Shekhar, Uma Mudenagudi
Proceeding SA '16 SIGGRAPH ASIA 2016 Posters Article No. 33
https://doi.org/10.1145/3005274.3005312
Point Cloud Data Cleaning and Refining for 3D As-
Built Modeling of Built Infrastructure
Abbas Rashidi and Ioannis Brilakis
Construction Research Congress 2016
http://sci-hub.cc/10.1061/9780784479827.093
Future experiments will also be required to quantitatively measure the accuracy of the presented algorithms, especially for the case of outlier removal. Developing robust algorithms for automatically recognizing 3D objects throughout the built infrastructure PCD, and thereby enhancing the object-oriented modeling stage, is another possible direction for future research.
Point cloud inpainting #2A
Dynamic occlusion detection and inpainting of in
situ captured terrestrial laser scanning point
clouds sequence
Chi Chen, Bisheng Yang
ISPRS Journal of Photogrammetry and Remote Sensing (2016)
https://doi.org/10.1016/j.isprsjprs.2016.05.007
In future work, the
proposed method will be
extended to incorporate
multiple geometric
features (e.g. shape index,
normal vector https://github.com/aboulch/normals_Hough
)
of local point distributions
to measure the geometric
consistency in the
background modeling
stage, aiming for higher
recall of the background
points during inpainting.
Point cloud inpainting #2B
Point cloud Quality assessment #1
Towards Subjective Quality Assessment of Point
Cloud Imaging in Augmented Reality
Alexiou, Evangelos; Upenik, Evgeniy; Ebrahimi, Touradj
IEEE 19th International Workshop on Multimedia Signal Processing, Luton Bedfordshire, United Kingdom, October 16-18, 2017
https://infoscience.epfl.ch/record/230115
On the performance of metrics to predict quality
in point cloud representations
Alexiou, Evangelos; Ebrahimi, Touradj
SPIE Optics + Photonics for Sustainable Energy, San Diego, California, USA, August 6-10, 2017
https://infoscience.epfl.ch/record/230116
As can be observed, our results show strong correlation between objective metrics and subjective scores in the presence of Gaussian noise. The statistical analysis shows that the current metrics perform well when Gaussian noise is introduced. However, in the presence of compression-like artifacts the performance is lower for every type of content, leading to the conclusion that performance is content dependent. Our results show that there is a need for better objective metrics that can more accurately predict all practical types of distortion for a wide variety of contents.
absolute category rating (ACR)
double-stimulus impairment scale (DSIS)
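Typical objective metrics in these studies include point-to-point distances between the reference and degraded clouds; a hedged sketch of a symmetric point-to-point RMSE with a k-d tree (one common variant, not the papers' exact metric):

```python
# Sketch: symmetric point-to-point RMSE between a reference point cloud
# and a degraded one -- a common objective metric in point cloud QA.
import numpy as np
from scipy.spatial import cKDTree

def p2p_rmse(ref, deg):
    d1, _ = cKDTree(ref).query(deg)   # degraded -> nearest reference point
    d2, _ = cKDTree(deg).query(ref)   # reference -> nearest degraded point
    return np.sqrt(np.mean(np.concatenate([d1, d2]) ** 2))

ref = np.random.default_rng(0).random((5000, 3))
deg = ref + np.random.default_rng(1).normal(0.0, 0.01, ref.shape)  # noise
print(p2p_rmse(ref, deg))
```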
Point cloud Quality assessment #2
A statistical method for geometry
inspection from point clouds
Francisco de Asís López, Celestino Ordóñez, Javier Roca-Pardiñas, Silverio García-Cortés
Applied Mathematics and Computation Volume 242, 1 September 2014, Pages 562-568
https://doi.org/10.1016/j.amc.2014.05.130
Assessing planar asymmetries in shipbuilding
from point clouds
Javier Roca-Pardiñas, Celestino Ordóñez, Carlos Cabo, Agustín Menéndez-Díaz
Measurement Volume 100, March 2017, Pages 252-261
https://doi.org/10.1016/j.measurement.2016.12.048
In this paper, a statistical test to perform geometry inspection is described. The methodology used allows a p-value for the established statistical hypothesis to be obtained by means of bootstrapping techniques.
An important aspect of the developed methodology, proved by means of a simulated experiment, is its capacity to control type I errors while being able to reject the null hypothesis when it is false. This experiment showed that the performance of the method improves when the point density increases.
The proposed method was applied to the inspection of a parabolic dish antenna, and the results show that it does not fit its theoretical shape unless a 1 mm tolerance is admitted.
It is noteworthy that although the method has been presented as a global test for geometry inspection, it would also be possible to apply it to inspect different parts of the object under study.
Yacht hull surface estimated from the point cloud.
Point cloud Quality assessment #3 Defect detection
Automated Change Diagnosis of Single-
Column-Pier Bridges Based on 3D
Imagery Data
Ying Shi; Wen Xiong, Ph.D., P.E., M.ASCE; Vamsisai Kalasapudi; Chao Geng
ASCE International Workshop on Computing in Civil Engineering 2017
http://doi.org/10.1061/9780784480830.012
The future work will include understanding the correlation between the deformation of the girder and column and the change in the thickness of the connected bearing. Such correlated change analysis will aid in understanding the cause of the observed thickness variation and in performing reliable condition diagnosis of all the single-pier bridges.
Point cloud Quality assessment #4 with uncertainty
Point cloud comparison under
uncertainty. Application to beam bridge
measurement with terrestrial laser
scanning
Francisco de Asís López, Celestino Ordóñez, Javier Roca-Pardiñas, Silverio García-Cortés
Measurement Volume 51, May 2014, Pages 259-264
https://doi.org/10.1016/j.measurement.2014.02.013
Assessment of along-normal uncertainties for
application to terrestrial laser scanning surveys of
engineering structures
Tarvo Mill, Artu Ellmann
Survey Review (2017) Vol. 0 , Iss. 0,0
http://dx.doi.org/10.1080/00396265.2017.1361565
Future studies should more closely investigate the dependence of the results on different TLS signal processing methods, and also the applicability of combined standard uncertainty (CSU; Bjerhammar 1973; Niemeier and Tengen 2017) equations that also consider systematic error in TLS surveys.
The application of the proposed
methodology to compare two
point clouds of a beam bridge
measured with two different
scanner systems, showed
significant differences in parts of
the beam. This is important in
inspection works since different
conclusions could be reached
depending on the measuring
instrument.
PDE-based Point cloud processing
Partial Difference Operators on Weighted Graphs
for Image Processing on Surfaces and Point
Clouds
François Lozes ; Abderrahim Elmoataz ; Olivier Lézoray
IEEE Transactions on Image Processing ( Volume: 23, Issue: 9, Sept. 2014 )
https://doi.org/10.1109/TIP.2014.2336548
PDE-Based Graph Signal Processing for 3-D
Color Point Clouds : Opportunities for
cultural heritage
François Lozes ; Abderrahim Elmoataz ; Olivier Lézoray
IEEE Signal Processing Magazine ( Volume: 32, Issue: 4, July 2015 )
https://doi.org/10.1109/MSP.2015.2408631
The approach allows processing of signal data on point clouds (e.g.,
spectral data, colors, coordinates, and curvatures). We have applied
this approach for cultural heritage purposes on examples aimed at
restoration, denoising, hole-filling, inpainting, object extraction, and
object colorization.
Sparse coding and point clouds #1
Cloud Dictionary: Sparse Coding and Modeling for
Point Clouds
Or Litany, Tal Remez, Alex Bronstein
(Submitted on 15 Dec 2016 (v1), last revised 20 Mar 2017 (this version, v2))
https://arxiv.org/abs/1612.04956
Sparse Geometric Representation Through
Local Shape Probing
Julie Digne, Sébastien Valette, Raphaëlle Chaine
(Submitted on 7 Dec 2016)
https://arxiv.org/abs/1612.02261
With the development of range
sensors such as LIDAR and time-of-
flight cameras, 3D point cloud scans
have become ubiquitous in computer
vision applications, the most
prominent ones being gesture
recognition and autonomous driving.
Parsimony-based algorithms have
shown great success on images and
videos where data points are sampled
on a regular Cartesian grid. We
propose an adaptation of these
techniques to irregularly sampled
signals by using continuous
dictionaries. We present an example
application in the form of point cloud
denoising
Building Information models (BIM) and point clouds
An IFC schema extension and binary serialization
format to efficiently integrate point cloud data into
building models
Thomas Krijnen, Jakob Beetz
Advanced Engineering Informatics Available online 3 April 2017
https://doi.org/10.1016/j.aei.2017.03.008
Building elements can be represented by various forms of geometry, including 2D and 3D line drawings, Constructive Solid Geometry (CSG), Boundary Representations (BRep) and tessellated meshes. However, these three-dimensional representations are just one of the many aspects conveyed in an IFC model. In addition, attributes related to thermal or acoustic performance, costing or intended use of spaces etc. can be added.
In many common data formats for the storage of point cloud data, such as E57 and PCD, metadata is attached to individual
data sets. This metadata for example includes scanner positions or weather conditions that are perceived during the scan.
From the acquisition process, the point data itself contains no grouping, decomposition or other information that relates the
points to the semantic meaning of the real-world object that was scanned. In subsequent processing steps such labels are
often added to the points. Several exchange formats, such as LAS, have options to store labels along with the points.
The magnitude of the data which is typically found in point cloud data sets and IFC model populations can be
dramatically different for the two file types. A meaningful IFC file can have file sizes in the order of a few megabytes, if
geometrical representations and property values are properly reused and especially when the file contains implicit,
parametric, rather than tessellated geometry. Depending on the amount of detail and precision, point cloud scans can
easily amount to gigabytes of data. Despite the larger size, due to the uniform structure and explicit nature, point clouds can
typically be more immediately explored than IFC building models, for which the boolean operations and implicit geometries
need to be evaluated prior to visualization.
The need for a unified and harmonized storage model of the two data types is observed in literature [e.g. Li et al 2008;
Golparvar-Fard et al. 2011]. Yet, the authors acknowledge that other use cases will exist in which a deep coupling between
building models and point clouds is unnecessary or even undesirable. This paper presents an extension to the IFC schema
by which an open and semantically rich standard arises.
Future: One of the core advantages of the HDF5 format is the use of transparent block-level compression. HDF5 allows several compression schemes, including user-defined compression methods. These would allow much higher compression ratios by exploiting structural knowledge of the point cloud or by introducing additional lossiness in the compression methods. In the prototypical implementation only gzip compression is used. Especially the point cloud segments stored as height maps projected on parametric surfaces might be suitable for specific-purpose compression methods, such as JPEG or PNG, which can exploit and filter imperceptible differences.
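That block-level gzip compression is a one-liner with h5py; a hedged sketch (dataset name and chunk size are my own choices, not the IFC extension's):

```python
# Sketch: store a point cloud in HDF5 with transparent block-level gzip
# compression, as in the prototypical implementation described above.
import h5py
import numpy as np

points = np.random.default_rng(0).random((1_000_000, 3)).astype(np.float32)

with h5py.File('pointcloud.h5', 'w') as f:
    f.create_dataset('points', data=points,
                     chunks=(65536, 3),      # chunking enables compression
                     compression='gzip', compression_opts=9)
```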
Lastly, future research will indicate how the associated point cloud structure
presented in this paper can be paired with other spatial indexing structures to
further advance the localized extraction of point cloud segments and spatial
querying techniques. Further experiments will be conducted to harness and reuse
the general purpose decomposition and aggregation relationships of the IFC to
implement octrees and kd-trees to further enhance the structure and accessibility
of the data.
Dynamic Surface Mesh Detail enhancement #1
Multi-scale geometric detail
enhancement for time-varying surfaces
Graphical Models Volume 76, Issue 5, September 2014, Pages 413-425
https://doi.org/10.1016/j.gmod.2014.03.010
We first develop an adaptive spatio-temporal bilateral filter, which produces temporally-
coherent and feature-preserving multi-scale representation for the time-varying surfaces.
We then extract the geometric details from the time-varying surfaces, and enhance
geometric details by exaggerating detail information at each scale across the time-varying
surfaces.
Velocity vectors estimation. The top row gives 4 frames in the time-varying
surfaces, and the bottom row gives the corresponding velocity vectors for each
frame
Multi-scale representation and detail enhancement for time-varying surfaces. First row: input time-varying surfaces; second row: multi-scale filtering results by filtering each frame individually; third row: multi-scale filtering results using the adaptive spatial–temporal filter; fourth and fifth rows: multi-scale detail enhancement results using 6 and 9 detail levels, respectively.
Limitations: In our current detail transfer results, we only transfer the
detail of a static model to time-varying surfaces. Our current algorithm
cannot transfer the geometry detail of time-varying surfaces to target
time-varying surfaces, which is challenging since it is difficult to build
the corresponding mapping between the source and target time-
varying surfaces with different surface frames.
Another problem is that although our filtering and enhancement
methods can alleviate the jittering artifacts, for input time-varying
surfaces with heavy jittering, the jittering artifacts still cannot be
removed completely. Processing surface sequences with heavy jittering
is a very hard problem, which requires further sophisticated
investigation.
Surface reconstruction Data Priors #1
Surface reconstruction with data-driven exemplar
priors
Oussama Remil, Qian Xie, Xingyu Xie, Kai Xu, Jun Wang
Computer-Aided Design Volume 88, July 2017, Pages 31-41
https://doi.org/10.1016/j.cad.2017.04.004
Given a noisy and sparse point cloud of structural complex mechanical part as input, our system produces the
consolidated points by aligning exemplar priors learned from a mechanical shape database. With the additional
information such as normals carried by our exemplar priors, our method achieves better feature preservation than
direct reconstruction on the input point cloud (e.g., Poisson).
An overview of our algorithm. We extract priors from a 3D shape database within the same category (e.g., mechanical parts) to construct a prior library. The affinity propagation clustering method is then performed on the prior library to obtain the set of representative priors, called the exemplar priors. Given an input point cloud, we construct its local neighborhoods and perform prior matching to find the most similar exemplar prior for each local neighborhood. Subsequently, we utilize the matched exemplar priors to consolidate the input point scan through an augmentation procedure, with which we can generate a faithful surface where sharp features and fine details are well recovered.
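A minimal sketch of the prior-library construction step, assuming each local neighborhood has already been encoded as a fixed-length descriptor (the 64-d descriptors here are placeholders); scikit-learn's affinity propagation stands in for the paper's clustering stage.

```python
# Sketch: cluster local-neighborhood descriptors into "exemplar priors"
# with affinity propagation (descriptors are illustrative placeholders).
import numpy as np
from sklearn.cluster import AffinityPropagation

descriptors = np.random.rand(500, 64)        # e.g. height-map patch features
ap = AffinityPropagation(damping=0.9, random_state=0).fit(descriptors)
exemplar_priors = descriptors[ap.cluster_centers_indices_]

def match_prior(neighborhood_descriptor):
    """Nearest exemplar prior for one local neighborhood of the input scan."""
    d = np.linalg.norm(exemplar_priors - neighborhood_descriptor, axis=1)
    return exemplar_priors[np.argmin(d)]
```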
Limitations: Our method is expected to behave well across different shape categories; meanwhile, a few limitations have to be discussed. Our algorithm fails when dealing with more challenging repositories with a small number of redundant elements, such as complex organic shapes. In addition, if there are large holes or big missing parts within the input scans, our method may fail to complete them based on the “matching-to-alignment” strategy.
Surface reconstruction Data Priors #2A
3D Reconstruction Supported by Gaussian
Process Latent Variable Model Shape Priors
Jens Krenzin, Olaf Hellwich
PFG – Journal of Photogrammetry, Remote Sensing and Geoinformation Science
May 2017, Volume 85, Issue 2, pp 97–112
https://doi.org/10.1007/s41064-017-0009-0
(a) A 2D shape representing a filled circle, where black represents the outside of the object and white the inside. (b) Corresponding signed distance function (SDF) for the shape shown in (a); the 0-level is highlighted in red. (c) Discrete Cosine Transform (DCT) coefficients for the SDF shown in (b). The first 15 DCT coefficients in each dimension store the important information about the shape; the remaining coefficients are nearly zero.
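The shape representation in the figure is easy to reproduce. A short sketch, assuming a 64×64 binary image and the sign convention "positive outside, negative inside" (the paper's sign convention may differ):

```python
# Sketch: SDF of a binary shape and its low-frequency DCT coefficients
# (first 15 per dimension), mirroring the figure.
import numpy as np
from scipy.ndimage import distance_transform_edt
from scipy.fft import dctn, idctn

shape = np.zeros((64, 64), bool)
yy, xx = np.mgrid[:64, :64]
shape[(yy - 32) ** 2 + (xx - 32) ** 2 < 20 ** 2] = True   # filled circle

# SDF: positive outside, negative inside; the 0-level lies on the contour
sdf = distance_transform_edt(~shape) - distance_transform_edt(shape)

coeffs = dctn(sdf, norm='ortho')
compact = np.zeros_like(coeffs)
compact[:15, :15] = coeffs[:15, :15]      # keep 15 coefficients per dimension
sdf_rec = idctn(compact, norm='ortho')    # ringing artefacts appear here
```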
Surface reconstruction Data Priors #2B
3D Reconstruction Supported by Gaussian
Process Latent Variable Model Shape Priors
Jens Krenzin, Olaf Hellwich
PFG – Journal of Photogrammetry, Remote Sensing and Geoinformation Science
May 2017, Volume 85, Issue 2, pp 97–112
https://doi.org/10.1007/s41064-017-0009-0
Results for object A—cup. a Sample image.
b Erroneous point cloud. c Ground truth. d
Shape prior. e Corrected point cloud
This article presents a method that removes outliers, reduces noise and fills holes in a
point cloud using a learned shape prior. The shape prior is learned from a set of training
objects using the GP-LVM.
It has been shown that an interpolated shape between several training shapes often has ringing artefacts due to the DCT compression step. Several investigations were made into how these artefacts could be reduced. In the first investigation, the difference between the training shapes was reduced and the latent space became denser. As expected, this reduced the Euclidean distance from one training example to the nearest training example. The closer two points are in the latent space, the more similar the corresponding shapes are. As a result, the artefacts are reduced, but only slightly.
In the second investigation, the DCT compression step was removed, so that the GP-LVM learns a lower-dimensional subspace directly on the SDF. It has been shown that this also leads to a slight reduction of the artefacts of the reconstructed shape, but they remain visible. In this work the GP-LVM was investigated as a candidate fulfilling the requirements. It has been shown that the number of shape parameters can be reduced, and that the model can be trained for specific object classes. Some of the experiments, related to model sparsity and well-behavedness, have uncovered weaknesses of the presented method. These issues will be further investigated in future work.
Classical
Image restoration
Depth maps
Depth MAP Inpainting #1
Kinect depth inpainting in real time
Lucian Petrescu ; Anca Morar ; Florica Moldoveanu ; Alin Moldoveanu
Telecommunications and Signal Processing (TSP), 2016
https://doi.org/10.1109/TSP.2016.7760974
Example of output from the median filter: A) input depth map where black pixels are not sampled; B) output image after applying the median filter; C) difference between input and output: grayscale – sampled pixel, blue – inpainted; D) confidence: blue – filtered, white – sampled, red – unfiltered.
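A minimal sketch of median-based depth inpainting in the spirit of the paper; the window size and the zero-as-missing convention are assumptions.

```python
# Sketch: fill unsampled (zero) Kinect depth pixels with the median of
# valid neighbours in a small window.
import numpy as np

def median_inpaint(depth, win=2):
    """depth: (H, W) array with 0 marking unsampled pixels."""
    out = depth.copy()
    for y, x in np.argwhere(depth == 0):
        patch = depth[max(0, y - win):y + win + 1,
                      max(0, x - win):x + win + 1]
        valid = patch[patch > 0]
        if valid.size:                    # otherwise the pixel stays unfilled
            out[y, x] = np.median(valid)
    return out
```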
Depth MAP Inpainting #2
A new method for inpainting of depth maps
from time-of-flight sensors based on a modified
closing by reconstruction algorithm
Journal of Visual Communication and Image Representation
Volume 47, August 2017, Pages 36-47
https://doi.org/10.1016/j.jvcir.2017.05.003
This procedure uses a modified
morphological closing by
reconstruction algorithm.
Finally, the proposed method works properly on depth maps where the regions are sufficiently well defined, or at least well enough to infer the missing information, e.g., depth maps obtained in indoor scenarios or acquired with sensors or methods that achieve these characteristics. Low-quality depth maps and those acquired in outdoor conditions may require additional pre-processing stages or even more robust methods, because the holes present in such images tend to be larger.
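For reference, a plain (unmodified) grey-level closing by reconstruction can be written with scikit-image as below; the paper's modified variant differs in details not reproduced here, and the structuring-element radius is illustrative.

```python
# Sketch: grey-level morphological closing by reconstruction on a depth map.
import numpy as np
from skimage.morphology import dilation, disk, reconstruction

def close_by_reconstruction(depth, radius=5):
    seed = dilation(depth, disk(radius))   # grey-level dilation as marker
    # Reconstruction by erosion shrinks the marker back onto the image,
    # filling dark holes while preserving the surrounding structure.
    return reconstruction(seed, depth, method='erosion')

depth = np.random.rand(120, 160)
depth[40:60, 50:80] = 0.0                  # simulated hole (dark region)
filled = close_by_reconstruction(depth)
```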
Filling Kinect depth holes via position-guided
matrix completion
Zhongyuan Wang, Xiaowei Song , ShiZheng Wang, Jing Xiao, Rui Zhong, Ruimin Hu
Neurocomputing Volume 215, 26 November 2016, Pages 48-52
https://doi.org/10.1016/j.neucom.2015.05.146
Depth MAP Inpainting #3
Learning-based super-resolution
with applications to intensity and
depth images
Haoheng Zheng, University of Wollongong, Doctor of Philosophy thesis, School
of Electrical, Computer and Telecommunications Engineering, University of
Wollongong, 2014. http://ro.uow.edu.au/theses/4284
Geometric Inpainting of 3D Structures
Pratyush Sahay, A. N. Rajagopalan
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2015, pp. 1-7
https://doi.org/10.1109/CVPRW.2015.7301388
… the proposed framework, albeit with occasional minor local artifacts.
“Low-rank Theory”
[184] Candes et al. (2011) “Robust principal
component analysis?” Journal of the ACM
(JACM) Volume 58 Issue 3, May 2011
https://doi.org/10.1145/1970392.1970395
Depth MAP super-resolution #1
Depth map super resolution
Murat Gevrekci ; Kubilay Pakin
Image Processing (ICIP), 2011 18th IEEE
https://doi.org/10.1109/ICIP.2011.6116454
Depth map acquisition with a ToF camera at different integration times. The image on the left is captured with a 10 ms integration time; note how the background depth information is noisy. The image on the right is captured with a 50 ms integration time: background depth is captured reliably, at the expense of saturating the near field with the high integration time.
We propose changing our constraint sets not only on a single range image but on differently exposed range images, to increase depth resolution within the whole working volume. This concept resembles High Dynamic Range (HDR) [Gevrekci and Gunturk 2007] image formation using differently exposed images. The proposed algorithm merges useful depth information from the different exposure levels and eliminates contaminated data (i.e., saturation, noise).
Modeling the imaging pipeline is a critical step in image enhancement, as demonstrated by the authors [Gevrekci and Gunturk 2005]. We propose modeling the depth map as a function of internal camera parameters, object and camera motion, and photometric changes due to the camera response function and the alternating integration time.
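A hedged sketch of the multi-exposure idea: per-pixel fusion of a short- and a long-integration depth map, using amplitude as a reliability proxy. The thresholds, the amplitude-based weighting, and all names are illustrative, not the authors' formulation.

```python
# Sketch: HDR-style fusion of two ToF depth maps captured at different
# integration times (thresholds and weighting are illustrative).
import numpy as np

def fuse_depth(d_short, d_long, amp_short, amp_long,
               sat_level=0.95, noise_floor=0.05):
    """Trust the long exposure except where its amplitude is saturated;
    fall back to the short exposure there. 0 = no reliable measurement."""
    ok_long = (amp_long < sat_level) & (amp_long > noise_floor)
    ok_short = amp_short > noise_floor
    return np.where(ok_long, d_long,
                    np.where(ok_short, d_short, 0.0))
```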
Spatially Adaptive Tensor Total Variation-Tikhonov
Model for Depth Image Super Resolution
Gang Zhong ; Sen Xiang ; Peng Zhou ; Li Yu
IEEE Access ( Volume: 5, 2017 )
https://doi.org/10.1109/ACCESS.2017.2715981
Visual comparison of 4× super-resolution results on our synthetic scene Chess: (a) the ground-truth depth image. Super-resolution results using (b) Tikhonov regularization, (c) total variation regularization, (d) color-guided tensor total variation regularization, (e) fused-edge-map-guided tensor total variation regularization, and (f) the spatially adaptive tensor total variation–Tikhonov regularization.
Depth MAP super-resolution #2
Image-guided ToF depth upsampling: a survey
Iván Eichhardt, Dmitry Chetverikov, Zsolt Jankó
Machine Vision and Applications May 2017, Volume 28, Issue 3–4, pp 267–282
https://doi.org/10.1007/s00138-017-0831-9
Effect of imprecise calibration on depth upsampling. The discrepancy between the
input depth and colour images is 2, 5 and 10 pixels, respectively
Effect of optical radial distortion on depth upsampling
Depth Map super-resolution #3
Super-resolution Reconstruction for
Binocular 3D Data
Wei-Tsung Hsiao ; Jing-Jang Leou ; Han-Hui Hsiao
Pattern Recognition (ICPR), 2014
https://doi.org/10.1109/ICPR.2014.721
Depth Superresolution using Motion
Adaptive Regularization
Ulugbek S. Kamilov, Petros T. Boufounos (Submitted on 4 Mar 2016)
https://arxiv.org/abs/1603.01633
Our motion adaptive method recovers a high-resolution depth sequence from high-resolution intensity and low-resolution depth sequences by imposing rank constraints on the depth patches: (a) and (b) t-y slices of the color and depth sequences, respectively, at a fixed x; (c)–(e) x-y slices at t1 = 10; (f)–(h) x-y slices at t2 = 40; (c) and (f) input color images; (d) and (g) input low-resolution and noisy depth images; (e) and (h) estimated depth images.
Illustration of the block matching within a space-time search area. The area in the current frame t is centered at the reference patch. Search is also conducted in the same window position in multiple temporally adjacent frames. Similar patches are grouped together to construct a block β_p = B_p φ.
Visual evaluation on Road video sequence.
Estimation of depth from its 3× downsized
version at 30 dB input SNR. Row 1 shows the
data at time instance t = 9. Row 2 shows the
data at the time instance t = 47. Row 3 shows
the t-y profile of the data at x = 64. Highlights
indicate some of the areas where depth
estimated by GDS-3D recovers details missing
in the depth estimate of DS-3D that does not
use intensity information.
Depth Map super-resolution #4
Depth Map Restoration From
Undersampled Data
Srimanta Mandal ; Arnav Bhavsar ; Anil Kumar Sao
IEEE Transactions on Image Processing ( Volume: 26, Issue: 1, Jan. 2017 )
https://doi.org/10.1109/TIP.2016.2621410
The objective of the paper: (a) uniform up-sampling of an LR depth map, i.e., filling in missing information in an HR grid generated from a uniformly sampled LR depth map – can be addressed by SR; (b) non-uniform up-sampling of a sparse point cloud, i.e., filling in the missing information in a randomly filled HR grid – can be addressed by PCC; (c) an extreme case of non-uniform up-sampling, where very little data is available. We suggest an approach wherein this is interpreted as non-uniform up-sampling followed by uniform up-sampling – can be addressed by PCC-SR.
We have addressed the problem of depth restoration by up-sampling either a uniformly sampled LR depth map or a sparse, non-uniformly sampled point cloud in a unified sparse representation framework.
Depth MAP Joint Superresolution-Inpainting #1
Range map superresolution-inpainting, and
reconstruction from sparse data
Computer Vision and Image Understanding
Volume 116, Issue 4, April 2012, Pages 572-591
https://doi.org/10.1016/j.cviu.2011.12.005
Depth map inpainting and super-resolution based
on internal statistics of geometry and appearance
Satoshi Ikehata ; Ji-Ho Cho ; Kiyoharu Aizawa
Image Processing (ICIP), 2013 20th IEEE
https://doi.org/10.1109/ICIP.2013.6738194
In this paper, we have proposed depth-
map inpainting and super-resolution
algorithms which explicitly capture the
internal statistics of a depth-map and its
registered texture image and have
demonstrated their state-of-the-art
performance. The current limitation is that
we have assumed the accurate registration
of the texture image and have not
assumed the presence of sensor noise. In
future work, we will evaluate our method’s
robustness to these problems to assess its
handling of more practical situations.
Range image expansion and inpainting. (a and d) LR images
with missing data for the apple and birdhouse datasets. (b
and e) Interpolated images with missing data. (c and f) Range
expansion with inpainting using the proposed method. 3D
reconstructions with light-rendering and gray-scale
representation, respectively, for (g and h) apple and (i and j)
birdhouse.
Range expansion with inpainting across
different objects. (a) Interpolated range
observation. (b) Corresponding HR and
inpainted range output using the
proposed method. (c–e) Unlinked, Linked
and residual edge maps, respectively,
which are used to restrict the smoothness
across edges.
Effect of noise on edge-linking. (a)
Noisy observation. (e) Corresponding
HR and inpainted output (b–d)
Unlinked, linked and residual edges
when no noise is added in the
observation. (f–h) Unlinked, linked and
residual edges for the observation in
(a).
Depth MAP Joint Superresolution-Inpainting #2
Superpixel-based depth map enhancement
and hole filling for view interpolation
Proceedings Volume 10420, Ninth International Conference on Digital
Image Processing (ICDIP 2017); 104202O (2017)
http://dx.doi.org/10.1117/12.2281544
Depth enhancement with improved exemplar-
based inpainting and joint trilateral guided filtering
Liang Zhang ; Peiyi Shen ; Shu'e Zhang ; Juan Song ; Guangming Zhu
Image Processing (ICIP), 2016 IEEE
https://doi.org/10.1109/ICIP.2016.7533131
Superpixel-based initial depth
map refinement: (a) superpixel
segmentation of the color image,
(b) initial depth map
segmentation using the same
superpixel label as (a), (c) initial
depth map before refinement, (d)
enhanced depth map of (c).
Superpixel-based warped depth map
hole filling: (a) and (b) are superpixels with
hole regions, (c) and (d) are hole filling
results of (a) and (b), respectively.
In this paper, we propose an efficient superpixel-based depth information processing method for view interpolation. First of all, the color image is segmented into superpixels using the SLIC algorithm, and the associated initial depth map is segmented with the same labels. After that, the depth-missing pixels are recovered by considering the color and depth superpixels jointly. Furthermore, the holes caused by disocclusion in the warped depth map can also be filled in the superpixel domain. Experimental results demonstrate that with the incorporation of the proposed initial depth map enhancement and warped depth map hole filling, better view interpolation performance is achieved.
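A minimal sketch of the superpixel-guided filling step, using scikit-image's SLIC and a per-superpixel median as a stand-in for the paper's joint color–depth recovery; segment count and compactness are illustrative.

```python
# Sketch: segment the colour image with SLIC, then fill depth-missing
# pixels from valid depths sharing the same superpixel label.
import numpy as np
from skimage.segmentation import slic

def superpixel_fill(color, depth, n_segments=400):
    """color: (H, W, 3) float image; depth: (H, W) with 0 = missing."""
    labels = slic(color, n_segments=n_segments, compactness=10)
    out = depth.copy()
    for lab in np.unique(labels):
        mask = labels == lab
        valid = depth[mask & (depth > 0)]
        if valid.size:
            out[mask & (depth == 0)] = np.median(valid)
    return out
```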
Future
Deep learning for 2D-based IMAGE restoration
Inspiration for non-Euclidean 3D images?
Image restoration Loss functions & Quality metrics #1A
Loss Functions for Image Restoration With
Neural Networks
Hang Zhao ; Orazio Gallo ; Iuri Frosio ; Jan Kautz NVIDIA, MIT Media Lab
IEEE Transactions on Computational Imaging ( Volume: 3, Issue: 1, March 2017 )
https://doi.org/10.1109/TCI.2016.2644865
The loss layer, despite being the effective driver of the network’s learning, has attracted little attention within the image processing research community: the choice of the cost function generally defaults to the squared ℓ2 norm of the error [Jain et al. 2009; Burger et al. 2012; Dong et al. 2014; Wang 2014]. This is understandable, given the many desirable properties this norm possesses. There is also a less well-founded, but just as relevant, reason for the continued popularity of ℓ2: standard neural network packages, such as Caffe, only offer the implementation for this metric.
However, ℓ2 suffers from well-known limitations. For instance, when the task at hand involves image quality, ℓ2 correlates poorly with image quality as perceived by a human observer [Zhang et al. 2012]. This is because of a number of assumptions implicitly made when using ℓ2. First and foremost, the use of ℓ2 assumes that the impact of noise is independent of the local characteristics of the image. On the contrary, the sensitivity of the Human Visual System (HVS) to noise depends on local luminance, contrast, and structure [Wang et al. 2004]. The ℓ2 loss also works under the assumption of white Gaussian noise, which is not valid in general [e.g. Wang and Bovik 2009].
We focus on the use of neural networks for image restoration tasks, and we study the effect of different metrics for the network’s loss layer. We compare ℓ2 against four error metrics on representative tasks: image super-resolution, JPEG artifact removal, and joint denoising plus demosaicking. First, we test whether a different local metric such as ℓ1 can produce better results. We then evaluate the impact of perceptually motivated metrics. We use two state-of-the-art metrics for image quality: the structural similarity index (SSIM [Wang et al. 2004]) and the multiscale structural similarity index (MS-SSIM [Wang et al. 2003]). We choose these among the plethora of existing indexes because they are established measures, and because they are differentiable, a requirement for the backpropagation stage. As expected, on the use cases we consider, the perceptual metrics outperform ℓ2. However, and perhaps surprisingly, this is also true for ℓ1; see Figure 1. Inspired by this observation, we propose a novel loss function and show its superior performance in terms of all the metrics we consider.
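The flavour of the proposed loss can be sketched as a convex mix of an SSIM term and ℓ1. The published loss uses MS-SSIM with Gaussian weighting and α = 0.84; the single-scale SSIM and uniform windows below are simplifications for illustration only.

```python
# Sketch of a mixed loss in the spirit of the paper:
# alpha * (1 - SSIM) + (1 - alpha) * l1, on images normalised to [0, 1].
import numpy as np
from scipy.ndimage import uniform_filter

def ssim(x, y, C1=0.01 ** 2, C2=0.03 ** 2, win=7):
    mu_x, mu_y = uniform_filter(x, win), uniform_filter(y, win)
    var_x = uniform_filter(x * x, win) - mu_x ** 2
    var_y = uniform_filter(y * y, win) - mu_y ** 2
    cov = uniform_filter(x * y, win) - mu_x * mu_y
    s = ((2 * mu_x * mu_y + C1) * (2 * cov + C2)) / \
        ((mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2))
    return s.mean()

def mixed_loss(pred, target, alpha=0.84):
    return alpha * (1.0 - ssim(pred, target)) + \
           (1.0 - alpha) * np.abs(pred - target).mean()
```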
Image restoration Loss functions & Quality metrics #1b
However, it is widely accepted that ℓ2, and consequently the Peak Signal-to-Noise Ratio (PSNR), do not correlate well with human perception of image quality: ℓ2 simply does not capture the intricate characteristics of the human visual system (HVS).
There exists a rich literature of error measures, both reference-based and non-reference-based, that attempt to address the limitations of the simple ℓ2 error function. For our purposes, we focus on reference-based measures. A popular reference-based index is the structural similarity index (SSIM). SSIM evaluates images accounting for the fact that the HVS is sensitive to changes in local structure. Wang et al. 2003 extend SSIM, observing that the scale at which local structure should be analyzed is a function of factors such as image-to-observer distance. To account for these factors, they propose MS-SSIM, a multi-scale version of SSIM that weighs SSIM computed at different scales according to the sensitivity of the HVS. Experimental results have shown the superiority of SSIM-based indexes over ℓ2. As a consequence, SSIM has been widely employed as a metric to evaluate image processing algorithms. Moreover, given that it can be used as a differentiable cost function, SSIM has also been used in iterative algorithms designed for image compression [Wang and Bovik 2009], image reconstruction [Brunet et al. 2010], denoising and super-resolution [Rehman et al. 2012], and even downscaling [Öztireli and Gross 2015]. To the best of our knowledge, however, SSIM-based indexes have never been adopted to train neural networks.
Recently, novel image quality indexes based on the properties of the HVS showed improved performance when compared to SSIM and MS-SSIM. One of these is the Information Weighted SSIM (IW-SSIM), a modification of MS-SSIM that also includes a weighting scheme proportional to the local image information [Wang and Li 2011]. Another is the Visual Information Fidelity (VIF), which is based on the amount of shared information between the reference and distorted image [Sheikh and Bovik 2006]. The Gradient Magnitude Similarity Deviation (GMSD) is characterized by simplified math and performance similar to that of SSIM, but it requires computing the standard deviation over the whole image [Xue et al. 2014]. Finally, the Feature Similarity Index (FSIM) leverages the perceptual importance of phase congruency, and measures the dissimilarity between two images based on local phase congruency and gradient magnitude [Zhang et al. 2011]. FSIM has also been extended to FSIMc, which can be used with color images. Despite the fact that they offer improved accuracy in terms of image quality, the mathematical formulation of these indexes is generally more complex than SSIM and MS-SSIM, and possibly not differentiable, making their adoption in optimization procedures not immediate.
Point cloud transformations
Numerical geometry of non-rigid shapes
Michael Bronstein
http://slideplayer.com/slide/4925779/
Left: Intrinsic vs. extrinsic properties of shapes. Top left: original shape. Top right: reconstructed shape from the geometry image, with cut edges displayed in red. The middle and bottom rows show the geometry image encoding the y coordinates and the HKS, respectively, of two spherical parameterizations (left and right). The two spherical parameterizations are symmetrically rotated by 180 degrees along the Y-axis. The geometry images for the Y-coordinate display an axial as well as an intensity flip, whereas the geometry images for the HKS only display an axial flip. This is because the HKS is an intrinsic shape signature (geodesics are preserved) whereas point coordinates on a shape surface are not. Center: Intrinsic descriptors (here the HKS) are invariant to shape articulations. Right: Padding structure of geometry images: the geometry images for the 3 coordinates are replicated to produce a 3×3 grid. The center image in each grid corresponds to the original geometry image. Observe that no discontinuities exist along the grid edges. Sinha et al. (2016)
Left: Geometry images created by fixing the polar axis of a hand (top) and an aeroplane (bottom), and rotating the spherical parametrization by equal intervals along the axis; the cut is highlighted in red. Center: Four rotated geometry images for a different cut location, highlighted in red. The plots to the right show padded geometry images, wherein the similarity across rotated geometry images is more evident and the five finger features are coherently visible. Right: Changing the viewing direction for a cut inverts the geometry image. The similarity in geometry images for the two diametrically opposite cuts emerges when we pad the image in a 3×3 grid. Sinha et al. (2016)
Authalic vs. conformal parametrization: (left to right) 2500 vertices of the hand mesh are color-coded in the first two plots. A 64×64 geometry image is created by uniformly sampling a parametrization and then interpolating the nearby feature values. The authalic geometry image encodes all tip features. Conformal parametrization compresses high-curvature points into dense regions [Gu et al. 2003]; hence, the finger tips are all mapped to very small regions. The fourth plot shows that the resolution of the geometry image is insufficient to capture the tip feature colors under conformal parametrization. This is validated by reconstructing the shape from geometry images encoding the x, y, z locations for both parameterizations in the final two plots.
2D super-resolution techniques for Geometry images
MemNet: A Persistent Memory
Network for Image Restoration
Ying Tai, Jian Yang, Xiaoming Liu, Chunyan Xu
(Submitted on 7 Aug 2017)
https://arxiv.org/abs/1708.02209
https://github.com/tyshiwo/MemNet
The same MemNet structure achieves state-of-the-art performance in image denoising, super-resolution and JPEG deblocking. Due to its strong learning ability, MemNet can be trained to handle different levels of corruption even using a single model.
CVAE-GAN: Fine-Grained Image
Generation through Asymmetric
Training
Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua
(Submitted on 29 Mar 2017)
https://arxiv.org/abs/1703.10155
https://github.com/tatsy/keras-generative
The proposed method can support a
wide variety of applications, including
image generation, attribute morphing,
image inpainting, and data
augmentation for training better face
recognition models
Surfaces segmentation and correspondence
Convolutional Neural Networks on
Surfaces via Seamless Toric Covers
Haggai Maron, Meirav Galun, Noam Aigerman, Miri Trope, Nadav Dym Ersin Yumer, Vladimir G.
Kim, Yaron Lipman | Weizmann Institute of Science, Adobe Research
ACM Transactions on Graphics (TOG)
Volume 36 Issue 4, July 2017 Article No. 71
http://dx.doi.org/10.1145/3072959.3073616
Parameterization produced by the geometry image method
of [Sinha et al. 2016]; the parameterization is not seamless
as the isolines break at the dashed image boundary (right);
although the parameterization preserves area it produces
large variability in shape.
Computing the flat-torus structure (middle) on a 4-cover of a sphere-type surface (left) defined by prescribing three points (colored disks). The right inset shows the flat torus resulting from a different triplet choice.
Visualization of “easy”
functions on the surface (top-
row) and their pushed version
on the flat-torus (bottom-row).
We show three examples of
functions we use as input to the
network: (a) average geodesic
distance (left), (b) the x
component of the surface
normal (middle), and (c) Wave
Kernel Signature [
Aubry et al. 2011]. The blowup
shows the face area, illustrating
that the input functions
capture relevant information in
the shape.
Experiments show that our method is able to learn and generalize semantic functions better than state-of-the-art geometric learning approaches in segmentation tasks. Furthermore, it can use only basic local data (Euclidean coordinates, curvature, normals) to achieve a high success rate, demonstrating an ability to learn high-level features from a low-level signal. This is the key advantage of defining a local translation-invariant convolution operator. Finally, it is easy to implement and is fully compatible with current standard CNN implementations for images.
A limitation of our technique is that it assumes the input shape is a mesh with sphere-like topology. An interesting direction for future work is extending our method to meshes with arbitrary topologies. This problem is especially interesting since in certain cases shapes from the same semantic class may have different genus. Another limitation is that aggregation is currently done as a separate post-processing step and not as part of the CNN optimization. Interesting future work in this regard is to incorporate the aggregation in the learning stage and produce an end-to-end learning framework.
Future
Deep learning for
Non-EUCLIDEAN
IMAGE restoration?
Point clouds Classification and segmentation
PointNet++: Deep Hierarchical Feature
Learning on Point Sets in a Metric Space
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas | Stanford University
(Submitted on 7 Jun 2017)
https://arxiv.org/abs/1706.02413
https://github.com/charlesq34/pointnet2 (TensorFlow)
Shapes in SHREC15 are 2D surfaces embedded in 3D space. Geodesic
distances along the surfaces naturally induce a metric space. We show through
experiments that adopting PointNet++ in this metric space is an effective way to
capture intrinsic structure of the underlying point set
We follow Rustamov et al. (2009) to obtain an
embedding metric that mimics geodesic
distance. Next we extract intrinsic point
features in this metric space including Wave
Kernel Signature (WKS) [Aubry et al. 2011],
Heat Kernel Signature (HKS) [Sun et al. 2009]
and multi-scale Gaussian curvature [
Meyer et al. 2003].
We use these features as input and then sample and group points according to the underlying metric space. In this way, our network learns to capture multi-scale intrinsic structure that is not influenced by the specific pose of a shape. Alternative design choices include using XYZ coordinates as point features or using Euclidean space ℝ³ as the underlying metric space; we show below that these are not optimal choices.
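The sampling-and-grouping step that PointNet++ set-abstraction layers build on can be sketched in a few lines of NumPy; the radius, group size, and point counts below are illustrative, and the real implementation runs on GPU.

```python
# Sketch: farthest point sampling of centroids, then a ball query
# around each centroid (pure NumPy, illustrative parameters).
import numpy as np

def farthest_point_sampling(pts, n_samples):
    """pts: (N, 3). Greedily pick n_samples indices maximizing coverage."""
    chosen = [0]
    dist = np.linalg.norm(pts - pts[0], axis=1)
    for _ in range(n_samples - 1):
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(pts - pts[nxt], axis=1))
    return np.array(chosen)

def ball_query(pts, centroids, radius=0.2, k=32):
    """Indices of up to k points within `radius` of each centroid."""
    groups = []
    for c in centroids:
        idx = np.flatnonzero(np.linalg.norm(pts - pts[c], axis=1) < radius)
        groups.append(idx[:k])
    return groups

pts = np.random.rand(1024, 3)
centroids = farthest_point_sampling(pts, 64)
groups = ball_query(pts, centroids)
```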
Point clouds Novel descriptors
Learning Compact Geometric Features
Marc Khoury, Qian-Yi Zhou, Vladlen Koltun
(Submitted on 15 Sep 2017)
https://arxiv.org/abs/1709.05056
We present an approach to learning features that represent the
local geometry around a point in an unstructured point cloud.
Such features play a central role in geometric registration, which
supports diverse applications in robotics and 3D vision.
The presented approach yields a family of features,
parameterized by dimension, that are both more compact and
more accurate than existing descriptors.
Background: The development of geometric descriptors for rigid alignment of unstructured point clouds dates back to the 90s. Classic descriptors include Spin Images [Johnson and Hebert 1999] and 3D Shape Context [Frome et al. 2004]. More recent work introduced Point Feature Histograms (PFH) [Rusu et al. 2008], Fast Point Feature Histograms (FPFH) [Rusu et al. 2009], Signatures of Histograms of Orientations (SHOT) [Salti et al. 2014], and Unique Shape Contexts (USC) [Tombari et al. 2010].
A comprehensive evaluation of existing local geometric descriptors is reported by Guo et al. 2016.
The learned descriptor is both more precise and
more compact than handcrafted features. Due to
its Euclidean structure, the learned descriptor
can be used as a drop-in replacement for
existing features in robotics, 3D vision, and
computer graphics applications. We expect
future work to further improve precision,
compactness, and robustness, possibly using
new approaches to optimizing feature
embeddings [Ustinova and Lempitsky 2016,
https://github.com/madkn/HistogramLoss,
https://youtu.be/_N1qYrv321E].
Dense Grid Point clouds generative model
Learning Efficient Point Cloud Generation
for Dense 3D Object Reconstruction
Chen-Hsuan Lin, Chen Kong, Simon Lucey
(Submitted on 21 Jun 2017)
https://arxiv.org/abs/1706.07036
We use 2D convolutional operations to predict the 3D structure from multiple viewpoints and jointly apply geometric reasoning with 2D projection optimization. We introduce the pseudo-renderer, a differentiable module that approximates the true rendering operation, to synthesize novel depth maps for optimization. Experimental results on single-image 3D object reconstruction tasks show that our method outperforms state-of-the-art methods in terms of shape similarity and prediction density.
Network architecture. From an encoded latent representation, we propose to use a structure generator, based on 2D convolutional operations, to predict the 3D structure at N viewpoints. The point clouds are fused by transforming the 3D structure at each viewpoint to canonical coordinates. The pseudo-renderer synthesizes depth images from novel viewpoints, which are further used for joint 2D projection optimization; it contains no learnable parameters and reasons purely from 3D geometry.
Concept of pseudo-rendering. Multiple transformed 3D points may project onto the same pixel in the image space. (a) Collisions can easily occur if the projections are directly discretized. (b) Upsampling the target image increases the precision of the projection locations and thus alleviates the collision effect; a max-pooling operation on the inverse depth values follows, to recover the original resolution while maintaining the effective depth value at each pixel. (c) Examples of pseudo-rendered depth images with various upsampling factors U (only valid depth values without collision are shown). Pseudo-rendering achieves performance closer to true rendering with a higher value of U.
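A minimal NumPy sketch of the pseudo-rendering idea under these assumptions: points already projected to continuous pixel coordinates, collisions resolved by keeping the maximum inverse depth in a U-times upsampled grid, then max-pooled back to the original resolution. All names are illustrative.

```python
# Sketch: pseudo-rendering with an upsampled grid and inverse-depth max.
import numpy as np

def pseudo_render(uv, z, H, W, U=4):
    """uv: (N, 2) projected pixel coords (x, y); z: (N,) positive depths.
    Returns an (H, W) inverse-depth image; 0 where no point projects."""
    inv = np.zeros((H * U, W * U))
    u = np.clip((uv[:, 0] * U).astype(int), 0, W * U - 1)   # x -> column
    v = np.clip((uv[:, 1] * U).astype(int), 0, H * U - 1)   # y -> row
    np.maximum.at(inv, (v, u), 1.0 / z)   # resolve collisions: nearest wins
    # max-pool U x U blocks back to the original H x W resolution
    return inv.reshape(H, U, W, U).max(axis=(1, 3))
```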
Point clouds GAN #1A
Representation Learning and Adversarial Generation of 3D Point Clouds
Panos Achlioptas, Olga Diamanti, Ioannis Mitliagkas, Leonidas Guibas Same last author as for PointNet++
(Submitted on 8 Jul 2017)
https://arxiv.org/abs/1707.02392
Editing parts in point clouds using vector arithmetic on the autoencoder (AE) latent space. Left to right: tuning the appearance of cars towards the shape of convertibles, adding armrests to chairs, removing handle from mug.
We build an end-to-end pipeline for 3D point
clouds that uses an AE to create a latent
representation, and a GAN to generate new
samples in that latent space. Our AE is
designed with a structural loss tailored to
unordered point clouds. Our learned latent
space, while compact, has excellent class-
discriminative ability: per our classification
results, it outperforms recent GAN-based
representations by 4.3%. In addition, the latent
space allows for vector arithmetic, which we
apply in a number of shape editing scenarios,
such as interpolation and structural
manipulation
We argue that jointly learning the representation and training the GAN is unnecessary for our modality. We propose a workflow that first learns a representation by training an AE with a compact bottleneck layer, then trains a plain GAN in that fixed latent representation. One benefit of this approach is that AEs are a mature technology: training them is much easier, and they are compatible with more architectures than GANs.
We point to theory [Arjovsky and Bottou 2017] that supports this idea, and verify it empirically: we show that GANs trained in our learned AE-based latent space generate visibly improved results, even with a generator and discriminator as shallow as a single hidden layer. Within a handful of epochs, we generate geometries that are recognized as their right object class at a rate close to that of ground-truth data. Importantly, we report significantly better diversity measures (10× divergence reduction) over the state of the art, establishing that we cover more of the original data distribution. In summary, we contribute:
● an effective cross-category AE-based latent representation of point clouds;
● the first (monolithic) GAN architecture operating on 3D point clouds;
● a surprisingly simpler, state-of-the-art GAN working in the AE’s latent space.
Point clouds GAN #1B
Raw point cloud GAN (r-GAN). The first version of our generative model operates directly on the raw 2048 × 3 point set input, driven by a 512-dimensional noise vector.
Finally, training a GAN in the latent space is much faster and much more stable. The inset provides some intuition with a toy example, where the data live on a 1D circular manifold. The density in red is the result of training a GAN’s generator in the original, 2D, data space. The most commonly used GAN objectives are equivalent to minimizing the Jensen–Shannon divergence (JSD) between the generator and data distributions. Unfortunately, the JSD is part of a family of divergences that become unbounded when there is support mismatch, which is the case in the example: the GAN places a lot of mass outside the data manifold. On the other hand, when training a small GAN in the fixed latent space of a trained AE (blue), the overlap of the two distributions increases significantly. According to recent theoretical advances [Arjovsky and Bottou 2017], this should improve stability.
Latent-space GAN (l-GAN). In our latent-space GAN (l-GAN), instead of operating on the raw point cloud input, we pass the data through our pre-trained autoencoder, trained separately for each object class with the earth mover’s distance (EMD) loss function. Both the generator and the discriminator of the GAN then operate on the 512-dimensional bottleneck variable of the AE. Finally, once the GAN training is over, the output of the generator is decoded to a point cloud via the AE decoder. The architecture of the l-GAN is significantly simpler than that of the r-GAN. We found that very shallow designs for both the generator and discriminator (in our case, 1 hidden layer for the generator and 2 for the discriminator) are sufficient to produce realistic results.
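In PyTorch, the l-GAN's networks really can be this small. A sketch with the shallow designs the authors describe (1 hidden layer for the generator, 2 for the discriminator); the AE and the training loop are omitted, and the hidden-layer widths are assumed for illustration.

```python
# Sketch: shallow GAN operating on a 512-d AE bottleneck (widths assumed).
import torch
import torch.nn as nn

generator = nn.Sequential(            # noise -> latent code (1 hidden layer)
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 512),
)
discriminator = nn.Sequential(        # latent code -> score (2 hidden layers)
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 1),
)

z = torch.randn(32, 128)
fake_codes = generator(z)             # decode with the pre-trained AE decoder
scores = discriminator(fake_codes)
```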
An interesting avenue for future work involves further exploring the idea of ingesting point clouds by sorting them lexicographically before applying a 1D convolution. A possibly interesting extension would be to study different 1D orderings that capture locality differently, e.g. Hilbert space-filling curves. We can also aim for convolution operators of higher order (2D and 3D).
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 

Recently uploaded (20)

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

Image Restoration for 3D Computer Vision

• 7. Edge-aware image smoothing
Smooth constant patches while retaining sharp edges, instead of a "dumb" Fourier low-pass filter that destroys the edges.
Deep Edge-Aware Filters. http://lxu.me/projects/deepeaf/ | http://proceedings.mlr.press/v37/xub15.html
L0 smoothing; BLF, Bilateral Filter. "Our method is based on a deep convolutional neural network with a gradient domain training procedure, which gives rise to a powerful tool to approximate various filters without knowing the original models and implementation details."
Efficient High-Dimensional, Edge-Aware Filtering | http://doi.org/10.1109/MCG.2016.119
Hui Huang, Shihao Wu, Minglun Gong, Daniel Cohen-Or, Uri Ascher, and Hao Zhang, "Edge-Aware Point Set Resampling," ACM Trans. on Graphics (presented at SIGGRAPH 2013), Volume 32, Number 1, Article 9, 2013. [PDF | Project page with source code] https://doi.org/10.1145/2421636.2421645
Caption: The denoising capability of the blurring-sharpening strategy based on the tooth volume (mesh). (a-d) are obtained by adding one particular type of noise, as indicated by the corresponding captions. SNR (in dB) of the noisy and the smoothed volumes are shown in each figure.
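To make the edge-aware idea concrete, here is a minimal NumPy sketch of a bilateral filter (one of the filters the deep network above learns to approximate): each pixel is replaced by an average of its neighbours, weighted by both spatial distance and intensity difference, so smoothing stops at strong edges. Function name, window radius and the two sigmas are illustrative, not values from any cited paper.

```python
import numpy as np

def bilateral_filter(img, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Edge-aware smoothing: average each pixel's neighbourhood weighted by
    spatial distance (domain kernel) AND intensity difference (range
    kernel), so averaging stops across strong edges."""
    h, w = img.shape
    pad = np.pad(img, radius, mode='reflect')
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(x ** 2 + y ** 2) / (2 * sigma_s ** 2))
    out = np.empty_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Range kernel: penalise intensity differences -> edges preserved.
            rng = np.exp(-(patch - img[i, j]) ** 2 / (2 * sigma_r ** 2))
            wgt = spatial * rng
            out[i, j] = (wgt * patch).sum() / wgt.sum()
    return out
```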
• 8. Super-resolution
Depending on your background, "super-resolution" means slightly different things.
https://www.ucl.ac.uk/super-resolution: "Super-resolution imaging allows the imaging of fluorescently labelled probes at a resolution of just tens of nanometers, surpassing classic light microscopy by at least one order of magnitude. Recent advances such as the development of photo-switchable fluorophores, high-sensitivity microscopes and single-molecule localisation algorithms make super-resolution imaging rapidly accessible to the wider life sciences research community. At UCL we are currently taking a multidisciplinary effort to provide researchers access to super-resolution imaging systems. The Super-Resolution Facility (SuRF) currently features commercial systems supporting the PALM/STORM, SIM and STED super-resolution approaches."
Beyond diffraction-limited | Multiframe | "Statistical upsampling", e.g. deep learning
http://www.infrared.avio.co.jp/en/products/ir-thermo/lineup/r500/index.html
http://www.robots.ox.ac.uk/~vgg/research/SR/
https://techcrunch.com/2016/06/20/twitter-is-buying-magic-pony-technology-which-uses-neural-networks-to-improve-images/
Deep Learning for Isotropic Super-Resolution from Non-Isotropic 3D Electron Microscopy. Larissa Heinrich, John A. Bogovic, Stephan Saalfeld. HHMI Janelia Research Campus, Ashburn, USA. https://arxiv.org/abs/1706.03142
• 9. Geometrical super-resolution
Caption: Both features extend over 3 pixels but in different amounts, enabling them to be localized with precision superior to the pixel dimension.
Multi-exposure image noise reduction: When an image is degraded by noise, there can be more detail in the average of many exposures, even within the diffraction limit. See the example on the right.
Single-frame deblurring: Known defects in a given imaging situation, such as defocus or aberrations, can sometimes be mitigated in whole or in part by suitable spatial-frequency filtering of even a single image. Such procedures all stay within the diffraction-mandated passband, and do not extend it.
Sub-pixel image localization: The location of a single source can be determined by computing the "center of gravity" (centroid) of the light distribution extending over several adjacent pixels (see figure on the left). Provided that there is enough light, this can be achieved with arbitrary precision, very much better than the pixel width of the detecting apparatus and the resolution limit for the decision of whether the source is single or double. This technique, which requires the presupposition that all the light comes from a single source, is at the basis of what has become known as super-resolution microscopy, e.g. STORM, where fluorescent probes attached to molecules give nanoscale distance information. It is also the mechanism underlying visual hyperacuity.
Bayesian induction beyond the traditional diffraction limit: Some object features, though beyond the diffraction limit, may be known to be associated with other object features that are within the limits and hence contained in the image. Then conclusions can be drawn, using statistical methods, from the available image data about the presence of the full object. The classical example is Toraldo di Francia's proposition of judging whether an image is that of a single or double star by determining whether its width exceeds the spread from a single star. This can be achieved at separations well below the classical resolution bounds, and requires the prior limitation to the choice "single or double?" The approach can take the form of extrapolating the image in the frequency domain, by assuming that the object is an analytic function and that we can exactly know the function values in some interval. This method is severely limited by the ever-present noise in digital imaging systems, but it can work for radar, astronomy, microscopy or magnetic resonance imaging. More recently, a fast single-image super-resolution algorithm based on a closed-form solution to l2-l2 problems has been proposed (Zhao et al. 2016) and demonstrated to accelerate most of the existing Bayesian super-resolution methods significantly. (WIKIPEDIA)
Detail-revealing Deep Video Super-resolution. Xin Tao, Hongyun Gao, Renjie Liao, Jue Wang, Jiaya Jia (Submitted on 10 Apr 2017). https://arxiv.org/abs/1704.02738
"Recent deep-learning-based video SR methods [Caballero et al. 2016; Kappeler et al. 2016] compensate inter-frame motion by aligning all other frames to the reference one, using backward warping. We show that such a seemingly reasonable technical choice is actually not optimal for video SR, and improving motion compensation can directly lead to higher quality SR results. In this paper, we achieve this by proposing a sub-pixel motion compensation (SPMC) strategy, which is validated by both theoretical analysis and extensive experiments."
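The sub-pixel localization paragraph above is easy to demonstrate. A minimal sketch, assuming a single isolated source (names and the synthetic test spot are illustrative): the intensity-weighted centroid recovers the source position well below the pixel pitch.

```python
import numpy as np

def subpixel_centroid(img):
    """Intensity-weighted centre of gravity of a single bright source,
    giving a position estimate far finer than the pixel pitch."""
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    total = img.sum()
    return (ys * img).sum() / total, (xs * img).sum() / total

# Synthetic spot whose true centre (10.3, 14.7) lies between pixel centres.
ys, xs = np.mgrid[0:32, 0:32]
spot = np.exp(-((ys - 10.3) ** 2 + (xs - 14.7) ** 2) / (2 * 1.5 ** 2))
print(subpixel_centroid(spot))  # approximately (10.3, 14.7)
```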
• 10. Optical or diffractive super-resolution (WIKIPEDIA)
Substituting spatial-frequency bands: Though the bandwidth allowable by diffraction is fixed, it can be positioned anywhere in the spatial-frequency spectrum. Dark-field illumination in microscopy is an example. See also aperture synthesis.
Multiplexing spatial-frequency bands, such as structured illumination: An image is formed using the normal passband of the optical device. Then some known light structure, for example a set of light fringes that is also within the passband, is superimposed on the target. The image now contains components resulting from the combination of the target and the superimposed light structure, e.g. moiré fringes, and carries information about target detail which simple, unstructured illumination does not. The "superresolved" components, however, need disentangling to be revealed.
Multiple parameter use within the traditional diffraction limit: If a target has no special polarization or wavelength properties, two polarization states or non-overlapping wavelength regions can be used to encode target details, one in a spatial-frequency band inside the cut-off limit, the other beyond it. Both would utilize normal passband transmission but are then separately decoded to reconstitute target structure with extended resolution.
Probing near-field electromagnetic disturbance: The usual discussion of superresolution involves conventional imagery of an object by an optical system. But modern technology allows probing the electromagnetic disturbance within molecular distances of the source, which has superior resolution properties; see also evanescent waves and the development of the new superlens.
Optical negative-index metamaterials. Nature Photonics 1, 41-48 (2007). doi: 10.1038/nphoton.2006.49 | Cited by 2372
Sub-Diffraction-Limited Optical Imaging with a Silver Superlens. Science 22 Apr 2005: Vol. 308, Issue 5721, pp. 534-537. doi: 10.1126/science.1108759 | Cited by 3219 articles
Optical and acoustic metamaterials: superlens, negative refractive index and invisibility cloak. Journal of Optics, Volume 19, Number 8. http://dx.doi.org/10.1088/2040-8986/aa7a1f → Special issue on the history of metamaterials
http://zeiss-campus.magnet.fsu.edu/articles/superresolution/supersim.html
• 11. Inpainting
Paint over artifacts / missing values using surrounding pixels ("Clone Tool" in Photoshop), more statistically using the same image ("Content-Aware Fill"), or bigger databases, for example in deep learning pipelines.
The TUM-Image Inpainting Database. Technische Universität München. https://www.mmk.ei.tum.de/tumiid/
Context Encoders: Feature Learning by Inpainting (2016). Deepak Pathak, Phillip Krähenbühl, Jeff Donahue, Trevor Darrell, Alexei A. Efros. http://people.eecs.berkeley.edu/~pathak/context_encoder/
Improve your skin with Inpaint. https://www.theinpaint.com/
Guillemot and Le Meur (2014) http://dx.doi.org/10.1109/MSP.2013.2273004
Yang et al. (2017) https://arxiv.org/abs/1611.09969
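As a concrete baseline for the "use surrounding pixels" family (not the learned context-encoder approach cited above), here is a minimal diffusion (harmonic) inpainting sketch in NumPy; hole pixels are repeatedly replaced by the average of their 4-neighbours until values diffuse in from the boundary. Function name and iteration count are illustrative.

```python
import numpy as np

def diffusion_inpaint(img, mask, n_iter=500):
    """Fill masked pixels by repeatedly averaging their 4-neighbours
    (harmonic/diffusion inpainting); known pixels are kept fixed.
    mask == True marks the holes. np.roll wraps at the borders, which is
    acceptable for interior holes but crude near image edges."""
    out = img.astype(float).copy()
    out[mask] = out[~mask].mean()              # crude initialisation
    for _ in range(n_iter):
        avg = 0.25 * (np.roll(out, 1, 0) + np.roll(out, -1, 0)
                      + np.roll(out, 1, 1) + np.roll(out, -1, 1))
        out[mask] = avg[mask]                  # update holes only
    return out
```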
• 13. Multiframe 2D super-resolution #1
A Unified Bayesian Approach to Multi-Frame Super-Resolution and Single-Image Upsampling in Multi-Sensor Imaging. Thomas Köhler, Johannes Jordan, Andreas Maier and Joachim Hornegger. Proceedings of the British Machine Vision Conference (BMVC), pages 143.1-143.12. BMVA Press, September 2015. https://dx.doi.org/10.5244/C.29.143
Robust Multiframe Super-Resolution Employing Iteratively Re-Weighted Minimization. Thomas Köhler, Xiaolin Huang, Frank Schebesch, André Aichert, Andreas Maier, Joachim Hornegger. IEEE Transactions on Computational Imaging (Volume: 2, Issue: 1, March 2016). https://doi.org/10.1109/TCI.2016.2516909
"Future work should consider an adaption of our prior to blind super-resolution where the camera PSF is unknown, or other image restoration problems, e.g. image deconvolution. In this work, we limited ourselves to non-blind super-resolution, where the PSF is assumed to be known. However, iteratively reweighted minimization could be augmented by blur estimation. Another promising extension is joint motion estimation and super-resolution, e.g. by using the nonlinear least squares algorithm. Conversely, blur and motion estimation can also benefit when using it in combination with our spatially adaptive model. One further direction of our future work is to make our approach adaptive to the scene content, e.g. by a local selection of the sparsity parameter p."
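Köhler et al. formulate multiframe SR as Bayesian estimation with iteratively re-weighted minimization; the simplest baseline that conveys the core idea is shift-and-add, sketched below under the assumption that sub-pixel inter-frame shifts are already known (in practice they come from registration). All names are illustrative, and the zero cells left by the naive scatter would still need interpolation.

```python
import numpy as np

def shift_and_add_sr(frames, shifts, scale=2):
    """Naive multi-frame super-resolution: scatter each low-resolution
    pixel onto a finer grid at its known sub-pixel offset, then average.
    frames: list of HxW arrays; shifts: per-frame (dy, dx) in LR pixels."""
    h, w = frames[0].shape
    acc = np.zeros((h * scale, w * scale))
    cnt = np.zeros_like(acc)
    for frame, (dy, dx) in zip(frames, shifts):
        # Target HR coordinates of every LR sample of this frame.
        iy = np.clip(np.round((np.arange(h) + dy) * scale).astype(int), 0, h * scale - 1)
        ix = np.clip(np.round((np.arange(w) + dx) * scale).astype(int), 0, w * scale - 1)
        yy, xx = np.meshgrid(iy, ix, indexing='ij')
        np.add.at(acc, (yy, xx), frame)
        np.add.at(cnt, (yy, xx), 1)
    # Cells never hit by any frame stay 0 and would need interpolation.
    return acc / np.maximum(cnt, 1)
```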
• 15. Point cloud acquisition
Guide to quickly build high-quality three-dimensional models with a structured light range scanner. Bao-Quan Shi and Jin Liang. OSA Applied Optics Vol. 55, Issue 36, pp. 10158-10169 (2016). https://doi.org/10.1364/AO.55.010158 [PDF] researchgate.net
• 16. Multiframe techniques or multisweep techniques #1
High Fidelity Scan Merging. Computer Graphics Forum, July 2010. http://doi.org/10.1111/j.1467-8659.2010.01773.x
For each scanned object, 3D triangulation laser scanners deliver multiple sweeps corresponding to multiple laser motions and orientations.
Scan integration as a labelling problem. Pattern Recognition, Volume 47, Issue 8, August 2014, Pages 2768-2782. https://doi.org/10.1016/j.patcog.2014.02.008
Captions: Example of overlapping scans; this head is such a complex structure that no less than 35 scans were acquired to fill in most holes. Example of two overlapping scans; points of each scan are first meshed ((c)-(d)) separately, and the result can be compared to the meshing of points of both scans together (d). Comparison of registration of two scans (colored in different colors in the top figure) using Global Non-Rigid Alignment (middle) and scale-space merging (bottom). Comparisons of the merging (a) with a level-set (Poisson Reconstruction) reconstruction of the unmerged scans point set (b) and a filtering of the unmerged scans point set (c): the level-set method obviously introduces serious smoothing, yet does not eliminate the scanning boundary lines; the bilateral filter, applied until all aliasing artifacts have been eliminated, over-smoothes some parts of the shape.
• 17. Multiframe techniques or multisweep techniques #2
Density adaptive trilateral scan integration method. Bao-Quan Shi and Jin Liang. Applied Optics Vol. 54, Issue 19, pp. 5998-6009 (2015). https://doi.org/10.1364/AO.54.005998
Multi-Focus Image Fusion Via Coupled Sparse Representation and Dictionary Learning. Rui Gao, Sergiy A. Vorobyov (Submitted on 30 May 2017). Aalto University, Dept. Signal Processing and Acoustics. https://arxiv.org/abs/1705.10574
Standard pipeline of 3D modeling with a commercial scanner (XJTUOM).
Caption: Integration of 26 partially overlapping scans of a dice model. (a) SDF method. (b) Screened Poisson method. (c) Advancing front triangulation method. (d) K-means clustering method. (e) The new method. The new method is more robust to large gaps/registration errors than previous methods. Owing to the noise-removal property of the trilateral shifting procedure and mean-shift clustering algorithm, the new method produces much smoother surfaces.
• 18. Multiframe techniques or multisweep techniques #3
Crossmodal point cloud registration in the Hough space for mobile laser scanning data. Bence Gálai, Balázs Nagy, Csaba Benedek. Pattern Recognition (ICPR), 2016. https://doi.org/10.1109/ICPR.2016.7900155
Caption: Top row: point clouds of three different vehicle-mounted Lidar systems (Velodyne HDL64 and VLP16 I3D scanners, and a Riegl VMX450 MMS), captured from the same scene at Fővám Tér, Budapest. Bottom row: segmentation results for each cloud by our proposed method.
• 19. Multiframe techniques or multisweep techniques #4
Frame Rate Fusion and Upsampling of EO/LIDAR Data for Multiple Platforms. T. Nathan Mundhenk, Kyungnam Kim, Yuri Owechko. Computer Vision and Pattern Recognition Workshops (CVPRW), 2014 IEEE. https://doi.org/10.1109/CVPRW.2014.117
Caption: The left pane shows the PanDAR demonstrator sensors with the red Ladybug sensor mounted over the silver Velodyne 64E LIDAR; a custom aluminum scaffold connects the two sensors. The right pane shows the graphical interface with displays of the 3D model at the top, and help menus and the depth map at the bottom. Multithreaded programming and GP-GPU methods allow us to obtain 10 fps with a Velodyne 64E LIDAR completely fused in 360° using a Ladybug panoramic camera.
PanDAR: a wide-area, frame-rate, and full color lidar with foveated region using backfilling interpolation upsampling. T. Nathan Mundhenk, Kyungnam Kim, Yuri Owechko. Proceedings Volume 9406, Intelligent Robots and Computer Vision XXXII: Algorithms and Techniques; 94060K (2015). Event: SPIE/IS&T Electronic Imaging, 2015, San Francisco, California, United States. http://dx.doi.org/10.1117/12.2078348
• 20. Multiframe techniques or multisweep techniques #5
Upsampling method for sparse light detection and ranging using coregistered panoramic images. Ruisheng Wang, Frank P. Ferrie. J. of Applied Remote Sensing, 9(1), 095075 (2015). http://dx.doi.org/10.1117/1.JRS.9.095075
Caption: See-through problem and invalid light detection and ranging (LiDAR, Velodyne HDL-64E) points returned from a building interior. (a) Camera image rendered from a certain viewpoint, (b) corresponding LiDAR image rendered from the same viewpoint as (a), (c) corresponding LiDAR image rendered from a top-down viewpoint.
"There are a number of improvements that are possible and are topics for future work. The initial depth ordering that is used to determine visibility assumes a piecewise planar partition of the scene. While this can suffice for the urban environment considered here, a more general approach would consider a richer form of representation, e.g., using statistical modeling methods. Cues that are available in the coregistered intensity data, such as the loci of occluding contours, could also be exploited. At present, our interpolation strategy samples image space to determine connectivity and backprojects to 3-D, resulting in a nonuniform interpolation. A better solution would be to perform the sampling in 3-D by backprojecting the 2-D boundary and forming a 3-D bounding box that could then be interpolated at the desired resolution. In the limit, true multimodal analysis would consider the joint distribution of both intensity and depth information with the aim of inferring more detailed interpolation functions. With the availability of sophisticated platforms such as Navteq True, there is clearly an incentive to move in these directions."
• 21. Classical image restoration for laser scans
Laser Scanner Super-resolution. https://doi.org/10.2312/SPBG/SPBG06/009-015 | Cited by 47 articles
• 22. Point cloud processing: reviews
Point Cloud Processing. Raphaële Héno and Laure Chandelier, in 3D Modeling of Buildings, Chapter 5 (2014). http://doi.org/10.1002/9781118648889.ch5
A review of algorithms for filtering the 3D point cloud. Signal Processing: Image Communication, Volume 57, September 2017, Pages 103-112. https://doi.org/10.1016/j.image.2017.05.009
Captions: Octree structuring: point cloud and different levels of the hierarchical grid. Example of significant noise on the profile view of a target: the standard deviation for a point cloud at target level is 8 mm.
A brief discussion of future research directions is presented as follows.
1) Combination of color and geometric information: for point clouds, especially those containing color information, a purely color- or purely geometry-based method cannot work well. Hence, combining color and geometric information in the filtering process is expected to further increase the performance of a filtering scheme.
2) Time complexity reduction: because point clouds contain a large number of points, sometimes up to hundreds of thousands or even millions, computation on them is time consuming. It is necessary to develop filtering technologies that process point clouds efficiently.
3) Filtering on point cloud sequences: since object recognition from point cloud sequences will become a future research direction, filtering the point cloud sequence will help to improve the performance and accuracy of object recognition.
• 23. Point cloud processing: software
PCL, Point Cloud Library (C++). http://pointclouds.org/ | https://github.com/PointCloudLibrary
MeshLab, with a beginner-friendly graphical front-end. http://www.meshlab.net/ | https://github.com/cnr-isti-vclab/meshlab
CGAL, Computational Geometry Algorithms Library (C++). https://www.cgal.org/ | https://github.com/CGAL/cgal
• 24. Point cloud denoising #1
Similarity based filtering of point clouds. Julie Digne. Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE. https://doi.org/10.1109/CVPRW.2012.6238917
Photogrammetric DSM denoising. Nex, F.; Gerke, M. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, XL.3: 231-238. Göttingen: Copernicus GmbH (2014). http://dx.doi.org/10.5194/isprsarchives-XL-3-231-2014
Caption: Differences between ground truth and noisy DSM.
Photogrammetric Digital Surface Models (DSM) are usually affected by both random noise and gross errors. These errors are generally concentrated in occluded or shadowed areas and are strongly influenced by the texture of the object considered, or the number of images employed for the matching. In the future, further tests will be performed on other real DSMs in order to assess the reliability of the developed method in very different operative conditions. Then, the extension from the 2.5D case to the fully 3D case will be performed, and further comparisons with other available denoising algorithms will be performed as well.
In addition, a key feature of our method is that it is independent of a surface mesh: it can work directly on point clouds, which is useful, since building a mesh of a noisy point cloud is never easy, whereas building a mesh of a properly denoised shape is well understood. A possible extension for this work would be to use the filter as a projector onto the surface, in a spirit similar to [Lipman et al. 2007] for example.
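A common pre-filter before any of the above (not Digne's similarity filter itself, which compares local shape descriptors): generic k-NN statistical outlier removal, sketched below with SciPy's kd-tree. Parameter defaults are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def remove_statistical_outliers(points, k=16, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbours is
    more than std_ratio standard deviations above the global mean."""
    dists, _ = cKDTree(points).query(points, k=k + 1)  # column 0 is the point itself
    mean_d = dists[:, 1:].mean(axis=1)
    keep = mean_d < mean_d.mean() + std_ratio * mean_d.std()
    return points[keep]
```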
• 25. Point cloud denoising #2
Point Cloud Denoising via Moving RPCA. E. Mattei, A. Castrodad. Computer Graphics Forum (2016). doi: 10.1111/cgf.13068
Guided point cloud denoising via sharp feature skeletons. Yinglong Zheng, Guiqing Li, Shihao Wu, Yuxin Liu, Yuefang Gao. The Visual Computer, June 2017, Volume 33, Issue 6-8, pp 857-867. https://doi.org/10.1007/s00371-017-1391-8
Caption: Denoising synthetic datasets of two planes meeting at increasingly shallow angles (20.4K points) with added Gaussian noise of standard deviation equal to 1% of the length of the bounding box diagonal. The two planes meet at angles of 140°, 150° and 160°. The first and second rows show the noisy 3D data and 2D transects, respectively. Rows 3-5 show the results of the bilateral filter, AWLOP and MRPCA.
Caption: Denoising of the Vienna cathedral SfM model. The noisy input was processed with MRPCA followed by a simple outlier removal method using MeshLab.
Although MRPCA is robust against outliers, this robustness is achieved only locally. One simple modification to achieve global outlier robustness is to use an l1 data-fitting term in problem (P6). Although the l1-norm will be able to handle the global outliers better than the Frobenius norm used in this work, the computational cost will increase significantly. We point out that the size of the neighbourhoods is set globally. One improvement over the current method could be to make the neighbourhood size a function of the local point density. This could have a positive effect when handling datasets with spatially varying noise.
• 26. Point cloud edge detection #1
Fast and Robust Edge Extraction in Unorganized Point Clouds. Dena Bazazian, Josep R. Casas, Javier Ruiz-Hidalgo. Digital Image Computing: Techniques and Applications (DICTA), 2015. https://doi.org/10.1109/DICTA.2015.7371262
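Bazazian et al. base their edge extraction on eigenvalue analysis of the local covariance; the sketch below shows that core ingredient (a per-point "surface variation" score), with function name, neighbourhood size and threshold as assumptions rather than the paper's exact settings.

```python
import numpy as np
from scipy.spatial import cKDTree

def edge_scores(points, k=20):
    """Per-point 'surface variation' from PCA of the local neighbourhood:
    lambda_min / (lambda_1 + lambda_2 + lambda_3). Close to 0 on flat
    patches, large near sharp edges and corners."""
    _, idx = cKDTree(points).query(points, k=k)
    scores = np.empty(len(points))
    for i, nbrs in enumerate(idx):
        lam = np.linalg.eigvalsh(np.cov(points[nbrs].T))  # ascending order
        scores[i] = lam[0] / lam.sum()
    return scores

# edge_points = points[edge_scores(points) > 0.05]  # threshold is data-dependent
```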
• 27. Point cloud edge detection #2
Segmentation-based Multi-Scale Edge Extraction to Measure the Persistence of Features in Unorganized Point Clouds. Dena Bazazian, Josep R. Casas, Javier Ruiz-Hidalgo. 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. Porto: 2017, p. 317-325. http://dx.doi.org/10.5220/0006092503170325
Caption: Estimating the neighbors of a sample point on the ear of the bunny at large scales: (a) far-away neighbors may belong to foreign surfaces when Euclidean distance is used; (b) geodesic distance is a better choice to explore large local neighborhoods; (c) the point cloud can be segmented to distinguish different surfaces.
• 28. Point cloud super-resolution #1
LidarBoost: Depth superresolution for ToF 3D shape scanning. Sebastian Schuon, Christian Theobalt, James Davis, Sebastian Thrun. Computer Vision and Pattern Recognition (CVPR), 2009. https://doi.org/10.1109/CVPR.2009.5206804
A new upsampling method for mobile LiDAR data. Ruisheng Wang, Jeff Bach, Jane Macfarlane, Frank P. Ferrie. Applications of Computer Vision (WACV), 2012 IEEE. https://doi.org/10.1109/WACV.2012.6162998
Caption: Real scene (wedges and panels) (a): this scene with many depth edges (b) demonstrates the true resolution gain. Image-based super-resolution (IBSR) (c) demonstrates increased resolution at the edges, but some aliasing remains and the strong pattern in the interior persists. LidarBoost (d) reconstructs the edges much more clearly and there is hardly a trace of aliasing; also the depth layers visible in the red encircled area are better captured. Markov Random Field (MRF) upsampling (e) oversmooths the depth edges and in some places allows the low-resolution aliasing to persist.
The main contributions of this paper are:
● A 3D depth sensor superresolution method that incorporates ToF-specific knowledge and data. Additionally, a new 3D shape prior is proposed that enforces 3D-specific properties.
● A comprehensive evaluation of the working range and accuracy of our algorithm using synthetic and real data captured with a ToF camera.
● Only few depth superresolution approaches have been developed previously. We show that our algorithm clearly outperforms the most related approaches.
• 29. Point cloud super-resolution #2
Geometry Super-Resolution by Example. Thales Vieira, Alex Bordignon, Thomas Lewiner, Luiz Velho. Computer Graphics and Image Processing (SIBGRAPI), 2009. https://doi.org/10.1109/SIBGRAPI.2009.10
Laser stripe model for sub-pixel peak detection in real-time 3D scanning. Ingmar Besic, Zikrija Avdagic. Systems, Man, and Cybernetics (SMC), 2016 IEEE. https://doi.org/10.1109/SMC.2016.7844912
"Our tests show that noise does not vary significantly when observed on different color channels. Thus estimator algorithms can utilize any of the color channels without sacrificing precision due to significantly increased noise. However, if a clever choice is to be made, then the estimator should opt for the green channel as it provides the most reliable stripe intensity data for both black and white surfaces across the whole modulation ROI. We have found that the noise does not have a continuous uniform distribution PDF, but a normal distribution PDF, and proposed a model that fits the empirical data. Our measurements support the assumption that the laser stripe image has an approximately Gaussian intensity profile. However, RMSE values show that a single Gaussian curve fit is not the best choice, as the stripe intensity profile is superposed with surface reflections. After testing Gaussian fits with 1 to 8 curves, we have concluded that models with n > 2 are not suitable as they produce false peaks or subtract light intensity to achieve better RMSE. Thus we proposed a Gaussian fit with two curves as the optimal model based on the empirical data. Our future work is to target relations between coefficients of the proposed laser stripe intensity profile and reduce their number if possible. Preliminary tests show that the mean values b1 and b2 of the Gaussian curves tend to be equal or to differ by a subpixel amount. It is not yet clear whether this difference has any significance or can be neglected. We also intend to test our model with different angles between the laser source and the target surface."
"The proposed method is limited to models which have repeated occurrences of a shape, and restricts the resolution increase to the regions of those occurrences. Increasing the resolution of other parts would require inpainting-like tools to extrapolate the geometry [Sharf et al. 2004], together with a superresolution scheme as an extension of the one proposed here."
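A minimal sketch of the two-Gaussian stripe model Besic and Avdagic argue for, applied to one column of a stripe image with SciPy's curve_fit; the initial-guess heuristic (weaker, wider secondary lobe for reflections) is my assumption, not taken from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_gaussians(x, a1, b1, c1, a2, b2, c2):
    return a1 * np.exp(-((x - b1) / c1) ** 2) + a2 * np.exp(-((x - b2) / c2) ** 2)

def stripe_subpixel_peak(profile):
    """Fit a two-Gaussian model to one column of the laser-stripe image
    and return the position of the dominant peak (sub-pixel)."""
    x = np.arange(len(profile), dtype=float)
    i0 = float(np.argmax(profile))
    # Heuristic initialisation: main lobe at the brightest pixel, plus a
    # weaker, wider lobe (surface reflections) next to it.
    p0 = [profile.max(), i0, 2.0, 0.3 * profile.max(), i0 + 1.0, 4.0]
    (a1, b1, _, a2, b2, _), _ = curve_fit(two_gaussians, x, profile, p0=p0, maxfev=5000)
    return b1 if a1 >= a2 else b2
```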
• 30. Point cloud super-resolution #3
Super-Resolution of Point Set Surfaces Using Local Similarities. Azzouz Hamdi-Cherif, Julie Digne, Raphaëlle Chaine. Computer Graphics Forum, 2017. http://dx.doi.org/10.1111/cgf.13216
Caption: Super-resolution of a single scan of the Maya point set. Left: initial scan; right: super-resolution. For visualization purposes, both are reconstructed with Poisson reconstruction.
Caption: Super-resolution of an input shape with highly repetitive geometric texture. (a) Underlying shape to be sampled by an acquisition device. (b) Low-resolution input sampling of the shape and local approximation with a quadric at each point; geometric texture is the residue over the quadric. (c) Super-resolved re-sampling using our method (fusion of super-resolved local patches). Right column: generation of the super-resolved patches. (d) Construction of a local descriptor of the residue over a low-resolution grid corresponding to the unfolded quadric; blue points represent the height values estimated at bin centres, red points are the input points. (e) Similar descriptor points are added (orange points) to the input points (in red) of the local descriptor. (f) A super-resolved descriptor is computed from the set of red and orange points.
Caption: Super-resolution of a single scan of the Persepolis point set. Left: initial scan; right: super-resolution. The shape details appear much sharper after the super-resolution process. Parameters: r = 4 (shape diagonal: 114), nbins_lr = 64, nbins_sr = 400 and r_sim = 0.2.
• 32. Point cloud classification #2
Multi-class US traffic signs 3D recognition and localization via image-based point cloud model using color candidate extraction and texture-based recognition. Vahid Balali, Arash Jahangiri and Sahar Ghanipoor Machiani. Advanced Engineering Informatics, Volume 32, April 2017, Pages 263-274. https://doi.org/10.1016/j.aei.2017.03.006
An improved Structure-from-Motion (SfM) procedure is developed to create a clean 3D point cloud from the street-level imagery and assist with accurate 3D localization by color and texture feature extraction. The detected traffic signs are triangulated using camera pose information and their corresponding locations are visualized in a 3D environment. The proposed method, as shown in Fig. 1, mainly consists of three key components: 1) detecting and classifying traffic signs using 2D images; 2) reconstructing and automatically cleaning a 3D point cloud model; and 3) recognizing and localizing traffic signs in the 3D environment.
• 33. Point cloud clustering for simplification #1
Adaptive simplification of point cloud using k-means clustering. Bao-Quan Shi, Jin Liang, Qing Liu. Computer-Aided Design, Volume 43, Issue 8, August 2011, Pages 910-922. https://doi.org/10.1016/j.cad.2011.04.001
A parallel point cloud clustering algorithm for subset segmentation and outlier detection. Christian Teutsch, Erik Trostmann, Dirk Berndt. Proceedings Volume 8085, Videometrics, Range Imaging, and Applications XI; 808509 (2011). http://dx.doi.org/10.1117/12.888654
Caption: Cluster initialization of the Stanford bunny. Left: input data. Middle: initialization of the cluster centroids. Right: initial clusters are formed; each cluster is shown in one color.
If the noise of the 3D point cloud is serious, effective noise filtering should be conducted before the simplification. The proposed method can also simplify multiple 3D point clouds; our future research will concentrate on simplifying multiple 3D point sets simultaneously.
For example, a point set with two million coordinates is analyzed within three seconds, and 15 million points within 35 seconds, on an Intel Core 2 processor. It handles arbitrary n-dimensional data formats, e.g. with additional color and/or normal vector information, since it is implemented as a template class. The algorithm is easy to parallelize, which further increases the computation performance on multi-core machines for most applications. The feasibility of our clustering technique has been evaluated on a variety of point clouds from different measuring applications and 3D scanning devices.
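The centroid-replacement step at the heart of k-means simplification fits in a few lines; a sketch with SciPy's kmeans2 follows. Note that Shi et al. additionally adapt cluster size to local curvature, which this uniform version does not attempt; the function name and the cluster-count heuristic in the usage comment are illustrative.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def simplify_point_cloud(points, n_clusters):
    """Replace each k-means cluster of points by its centroid; the
    cluster count sets the output resolution."""
    centroids, labels = kmeans2(np.asarray(points, float), n_clusters, minit='++')
    return centroids

# simplified = simplify_point_cloud(cloud_xyz, n_clusters=len(cloud_xyz) // 10)
```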
• 34. Point cloud object detection #1
Object Detection in Point Clouds Using Conformal Geometric Algebra. Aksel Sveier, Adam Leon Kleppe, Lars Tingelstad and Olav Egeland. Advances in Applied Clifford Algebras, 2017. http://dx.doi.org/10.1007/s00006-017-0759-1
In this paper we focus on the detection of primitive geometric models in point clouds using RANSAC. A central step in the RANSAC algorithm is to classify inliers and outliers. We show that conformal geometric algebra (CGA) enables filters with geometrical interpretation for inlier/outlier classification. The last step of the RANSAC algorithm is fitting the primitive to its inliers. This can be performed analytically with CGA, and the method is identical for both planes and spheres.
Caption: Setup of the robotic pick-and-place demonstration. Point clouds from the 3D camera are used for detecting the plane, spheres and cylinder. The information is sent to the robot arm, which is used to place the spheres in the cylinder.
Spheres were successfully detected in point clouds with up to 90% outliers, and cylinders could successfully be detected in point clouds with up to 80% outliers. We suggested two methods for constructing a cylinder from point data using CGA and found that fitting two spheres to a cylinder gave performance advantages compared to constructing a circle and line from 3 points on the cylinder surface.
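For readers unfamiliar with the RANSAC loop the paper builds on, here is a plain Euclidean plane-detection sketch; the paper's contribution is to perform the inlier test and the final fit with conformal geometric algebra instead, identically for planes and spheres. Parameter defaults are illustrative.

```python
import numpy as np

def ransac_plane(points, n_iter=1000, tol=0.01, seed=0):
    """Detect the dominant plane: fit a candidate plane to 3 random
    points, count inliers within distance tol, keep the best candidate."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-12:
            continue                               # degenerate (collinear) sample
        n = n / np.linalg.norm(n)
        inliers = np.abs((points - p0) @ n) < tol  # point-to-plane distance test
        if inliers.sum() > best.sum():
            best = inliers
    return best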
• 35. Point cloud compression, static #1
Research on the Self-Similarity of Point Cloud Outline for Accurate Compression. Xuandong An, Xiaoqing Yu, Yifan Zhang. 2015 International Conference on Smart and Sustainable City and Big Data (ICSSC). http://dx.doi.org/10.1049/cp.2015.0272
Caption: The Lovers of Bordeaux (15.8 million points). Exploiting self-similarity in the model, we compress this representation down to 1.15 MB. The resulting model (right) is very close to the original one (left), as the reconstruction error is less than the laser scanner precision (0.02 mm) for 99.14% of the input points.
Point cloud compression approaches have mostly dealt with coordinate quantization via recursive space partitioning [Gandoin and Devillers 2002; Schnabel and Klein 2006; Huang et al. 2006; Smith et al. 2012]. In a nutshell, these approaches consist in inserting the points into a space-partitioning data structure (e.g. octree, kd-tree) of given depth, and replacing them by the center of the cell they belong to.
Self-similarity of measured signals has gained interest over the past decade: research on signal processing as well as image processing has accomplished outstanding progress by taking advantage of the self-similarity of the measurement. In the image processing field, the idea originated in the non-local means algorithm [Buades et al. 2005]: instead of denoising a pixel using its neighboring pixels, it is denoised by exploiting pixels of the whole image that look similar. The similarity between pixels is computed by comparing patches around them. Behind this powerful tool lies the idea that pixels far away from the considered area might entail information that will help processing it, because of the natural self-similarity of the image. Self-similarity of surfaces has mainly been exploited for surface denoising applications: the non-local means filter has been adapted for surfaces, be it meshes [Yoshizawa et al. 2006] or point clouds [Adams et al. 2009; Digne 2012]. It was also used to define a Point Set Surface variant [Guillemot et al. 2012] exhibiting better robustness to noise. Self-similarity of surfaces is obviously not limited to denoising purposes. For example, analyzing the similarity of a surface can lead to detecting symmetries or repetition structures in surfaces [Mitra et al. 2006; Pauly et al. 2008]. An excellent survey of methods exploiting symmetry in shapes can be found in [Mitra et al. 2013].
There are several ways in which our compression scheme could be improved:
● Exploiting patch-based representation, artifacts may appear at boundaries, which could be dilated throughout decompression. One could mitigate this issue by adjusting the patch size (clipping some outer grid cells) along boundaries. This would require storing one small integer for each patch, at a small cost.
● Other seed-picking strategies could be implemented, for example by placing the seeds so that they minimize the local error, in the spirit of [Ohtake et al. 2006].
● Encoding per-point attributes such as normals and colors is possible with the same similarity-based coder.
Perspectives: although our algorithm is based on the exploitation of self-similarity on the whole surface, most of the involved treatments remain local. This is a good prospect for handling data of ever-increasing size, using streaming processes. This is particularly important at a time when geometric digitization campaigns sometimes cover entire cities.
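The "quantize to cell centers" baseline that the self-similarity coder improves upon is simple to state in code. A sketch under the assumption of a uniform grid of 2^depth cells per axis (a real octree coder would entropy-code the tree occupancy rather than store indices):

```python
import numpy as np

def voxel_quantize(points, depth, bbox_min, bbox_max):
    """Simplest coordinate-quantization coder: snap each point to the
    centre of its cell in a 2^depth grid and keep unique cells only;
    the integer cell indices are what an octree coder would entropy-code."""
    bmin = np.asarray(bbox_min, float)
    bmax = np.asarray(bbox_max, float)
    cells = 2 ** depth
    scale = (bmax - bmin) / cells
    idx = np.clip(np.floor((points - bmin) / scale).astype(int), 0, cells - 1)
    unique_idx = np.unique(idx, axis=0)
    return bmin + (unique_idx + 0.5) * scale   # decoded point positions
```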
• 36. Point cloud compression, dynamic #1: voxelized
Motion-Compensated Compression of Dynamic Voxelized Point Clouds. Ricardo L. de Queiroz, Philip A. Chou. IEEE Transactions on Image Processing (Volume: 26, Issue: 8, Aug. 2017). https://doi.org/10.1109/TIP.2017.2707807
As a new concept for a new application, much still has to be fine-tuned and perfected. For example, the post-processing (in-loop or otherwise) is far from reaching its peak performance. Both the morphological and the filtering operations are not well understood in this context. Similarly, the distortion metrics and the voxel matching methods are not developed to a satisfactory point. There is still plenty of work to be done to extend the present framework to use B-frames (bidirectional prediction) and to extend the GOF to a more typical IBBPBBP... format. Furthermore, we want to use adaptive block sizes, which are optimally selected in an RD sense, and we also want to encode both the geometry and the color residues for the predicted (P and B) blocks. Finally, rather than re-using the correspondences from the surface reconstruction among consecutive frames, we want to develop efficient motion estimation methods for use with our coder. Each of these enhancements should improve the coder performance, such that there is a continuous sequence of improvements in this new frontier to be explored.
• 37. Point cloud compression, dynamic #2
Graph-Based Compression of Dynamic 3D Point Cloud Sequences. Dorina Thanou, Philip A. Chou, Pascal Frossard. IEEE Transactions on Image Processing (Volume: 25, Issue: 4, April 2016). https://doi.org/10.1109/TIP.2016.2529506
Caption: Example of a point cloud of the 'yellow dress' sequence (a). The geometry is captured by a graph (b) and the r component of the color is considered as a signal on the graph (c). The size and the color of each disc indicate the value of the signal at the corresponding vertex.
Caption: Octree decomposition of a 3D model for two different depth levels. The points belonging to each voxel are represented by the same color.
There are a few directions that can be explored in the future. First, it has been shown in our experimental section that a significant part of the bit budget is spent on the compression of the 3D geometry, which, given a particular depth of the octree, is lossless. A lossy compression scheme that permits some errors in the reconstruction of the geometry could bring non-negligible benefits in terms of the overall rate-distortion performance. Second, the optimal bit allocation between geometry, color and motion vector data remains an interesting and open research problem, due mainly to the lack of a suitable metric that balances geometry and color visual quality. Third, the estimation of the motion is done by computing features based on the spectral graph wavelet transform. Features based on data-driven dictionaries, such as the ones proposed in [Thanou et al. 2014], are expected to increase significantly the matching, and consequently the compression performance.
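The "graph that captures the geometry" in the caption above is typically a weighted k-NN graph; a minimal sketch of its construction and of the combinatorial Laplacian L = D - W, on which graph transforms of colour signals are built. k and sigma are illustrative, not the paper's values.

```python
import numpy as np
from scipy import sparse
from scipy.spatial import cKDTree

def knn_graph_laplacian(points, k=8, sigma=0.05):
    """Build a Gaussian-weighted k-NN graph over the cloud and return its
    combinatorial Laplacian L = D - W; colour can then be treated as a
    signal on this graph (a smooth signal x has small x^T L x)."""
    n = len(points)
    dists, idx = cKDTree(points).query(points, k=k + 1)  # column 0: the point itself
    rows = np.repeat(np.arange(n), k)
    cols = idx[:, 1:].ravel()
    w = np.exp(-dists[:, 1:].ravel() ** 2 / (2 * sigma ** 2))  # Gaussian edge weights
    W = sparse.coo_matrix((w, (rows, cols)), shape=(n, n)).tocsr()
    W = W.maximum(W.T)                                   # symmetrise the graph
    return sparse.diags(np.asarray(W.sum(axis=1)).ravel()) - W
```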
• 38. Dynamic meshes: Laplace operator
A 3D+t Laplace operator for temporal mesh sequences. Victoria Fernández Abrevaya, Sandeep Manandhar, Franck Hétroy-Wheeler, Stefanie Wuhrer. Computers & Graphics, Volume 58, August 2016, Pages 12-22. https://doi.org/10.1016/j.cag.2016.05.018
In this paper we have introduced a discrete Laplace operator for temporally coherent mesh sequences. This operator is defined by modelling the sequences as CW complexes in a 4-dimensional Riemannian space and using Discrete Exterior Calculus. A user-defined parameter α is associated with the 4D space to control the influence of motion with respect to the geometry. We have shown that this operator can be expressed by a sparse blockwise tridiagonal matrix, with a linear number of non-zero coefficients with respect to the number of vertices in the sequence. The storage overhead with respect to frame-by-frame mesh processing is limited. We have also shown an application example, as-rigid-as-possible editing, for which it is relatively easy to extend the classical static Laplacian framework to mesh sequences with this matrix. Similar results to state-of-the-art methods can be reached with a simple, global formulation. This opens the possibility for many other problems in animation processing to be tackled the same way by taking advantage of the existing literature on the Laplacian operator for 3D meshes [Zhang et al. 2010]. In the future, we are in particular interested in studying the spectral properties of the defined discrete Laplace operator.
• 39. Point cloud inpainting #1
Region of interest (ROI) based 3D inpainting. Shankar Setty, Himanshu Shekhar, Uma Mudenagudi. Proceedings SA '16, SIGGRAPH ASIA 2016 Posters, Article No. 33. https://doi.org/10.1145/3005274.3005312
Point Cloud Data Cleaning and Refining for 3D As-Built Modeling of Built Infrastructure. Abbas Rashidi and Ioannis Brilakis. Construction Research Congress 2016. http://sci-hub.cc/10.1061/9780784479827.093
Future experiments will also be required to quantitatively measure the accuracy of the presented algorithms, especially for the case of outlier removal. Developing robust algorithms for automatically recognizing 3D objects throughout the built-infrastructure PCD, and therefore enhancing the object-oriented modeling stage, is another possible direction for future research.
  • 40. Point cloud inpainting #2A Dynamic occlusion detection and inpainting of in situ captured terrestrial laser scanning point clouds sequence Chi Chen, Bisheng Yang ISPRS Journal of Photogrammetry and Remote Sensing (2016) https://doi.org/10.1016/j.isprsjprs.2016.05.007 In future work, the proposed method will be extended to incorporate multiple geometric features (e.g. shape index, normal vector https://github.com/aboulch/normals_Hough ) of local point distributions to measure the geometric consistency in the background modeling stage, aiming for a higher recall of the background points during inpainting.
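Per-point normals, one of the geometric features mentioned above, are classically estimated by PCA over a local neighborhood; the linked repository implements a more noise-robust Hough-based variant. A minimal PCA sketch with SciPy's k-d tree (the neighborhood size `k` is an illustrative choice):

```python
import numpy as np
from scipy.spatial import cKDTree

def estimate_normals(points, k=16):
    """Per-point unit normals from PCA of k-nearest-neighbor patches.

    The normal is the eigenvector of the local covariance with the
    smallest eigenvalue, i.e. the direction of least spread.
    """
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    normals = np.empty_like(points)
    for i, nbrs in enumerate(idx):
        patch = points[nbrs] - points[nbrs].mean(axis=0)
        # Eigenvectors of the 3x3 covariance, eigenvalues ascending.
        _, vecs = np.linalg.eigh(patch.T @ patch)
        normals[i] = vecs[:, 0]
    return normals

pts = np.random.rand(500, 3)
pts[:, 2] *= 0.01                 # a nearly planar cloud
n = estimate_normals(pts)
print(np.abs(n[:, 2]).mean())     # close to 1: normals point along z
```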
  • 42. Point cloud Quality assessment #1 Towards Subjective Quality Assessment of Point Cloud Imaging in Augmented Reality Alexiou, Evangelos; Upenik, Evgeniy; Ebrahimi, Touradj IEEE 19th International Workshop on Multimedia Signal Processing, Luton Bedfordshire, United Kingdom, October 16-18, 2017 https://infoscience.epfl.ch/record/230115 On the performance of metrics to predict quality in point cloud representations Alexiou, Evangelos; Ebrahimi, Touradj SPIE Optics + Photonics for Sustainable Energy, San Diego, California, USA, August 6-10, 2017 https://infoscience.epfl.ch/record/230116 As can be observed, our results show strong correlation between objective metrics and subjective scores in the presence of Gaussian noise. The statistical analysis shows that the current metrics perform well when Gaussian noise is introduced. However, in the presence of compression-like artifacts the performance is worse for every type of content, leading to the conclusion that the performance is content-dependent. Our results show that there is a need for better objective metrics that can more accurately predict all practical types of distortions for a wide variety of contents. Test methodologies: absolute category rating (ACR), double-stimulus impairment scale (DSIS).
  • 43. Point cloud Quality assessment #2 A statistical method for geometry inspection from point clouds Francisco de Asís López, Celestino Ordóñez, Javier Roca-Pardiñas, Silverio García-Cortés Applied Mathematics and Computation Volume 242, 1 September 2014, Pages 562-568 https://doi.org/10.1016/j.amc.2014.05.130 Assessing planar asymmetries in shipbuilding from point clouds Javier Roca-Pardiñas, Celestino Ordóñez, Carlos Cabo, Agustín Menéndez-Díaz Measurement Volume 100, March 2017, Pages 252-261 https://doi.org/10.1016/j.measurement.2016.12.048 In this paper, a statistical test to perform geometry inspection is described. The methodology allows a p-value for the stated statistical hypothesis to be obtained by means of bootstrapping techniques. An important aspect of the developed methodology, demonstrated by means of a simulated experiment, is its capacity to control type I errors while still rejecting the null hypothesis when it is false. This experiment showed that the performance of the method improves when the point density increases. The proposed method was applied to the inspection of a parabolic dish antenna, and the results show that it does not fit its theoretical shape unless a 1 mm tolerance is admitted. It is noteworthy that although the method has been presented as a global test for geometry inspection, it would also be possible to apply it to inspect different parts of the object under study. Yacht hull surface estimated from the point cloud.
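A bootstrap test of this kind can be sketched quickly: resample the point-to-model residuals, recompute the test statistic, and read the p-value off the empirical distribution. The following is a generic illustration of that recipe, not the paper's exact statistic or procedure:

```python
import numpy as np

def bootstrap_pvalue(residuals, n_boot=2000, seed=0):
    """Test H0: 'the scanned surface fits the design shape'.

    Statistic: mean absolute point-to-model distance. The null
    distribution is approximated by resampling mean-centered residuals,
    and the p-value is the fraction of resampled statistics at least
    as extreme as the observed one.
    """
    rng = np.random.default_rng(seed)
    observed = np.mean(np.abs(residuals))
    centered = residuals - residuals.mean()
    stats = np.array([
        np.mean(np.abs(rng.choice(centered, size=residuals.size, replace=True)))
        for _ in range(n_boot)
    ])
    return np.mean(stats >= observed)

# Toy usage: residuals with a 1 mm systematic offset get rejected.
rng = np.random.default_rng(1)
print(bootstrap_pvalue(rng.normal(0.0, 0.5, 5000)))   # large p-value: keep H0
print(bootstrap_pvalue(rng.normal(1.0, 0.5, 5000)))   # ~0: reject H0
```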
  • 44. Point cloud Quality assessment #3 Defect detection Automated Change Diagnosis of Single-Column-Pier Bridges Based on 3D Imagery Data Ying Shi; Wen Xiong, Ph.D., P.E., M.ASCE; Vamsisai Kalasapudi; Chao Geng ASCE International Workshop on Computing in Civil Engineering 2017 http://doi.org/10.1061/9780784480830.012 Future work will include understanding the correlation between the deformation of the girder and column and the change in the thickness of the connected bearing. Such correlated change analysis will aid in understanding the cause of the observed thickness variation and in performing reliable condition diagnosis of all single-pier bridges.
  • 45. Point cloud Quality assessment #4 with uncertainty Point cloud comparison under uncertainty. Application to beam bridge measurement with terrestrial laser scanning Francisco de Asís López, Celestino Ordóñez, Javier Roca-Pardiñas, Silverio García-Cortés Measurement Volume 51, May 2014, Pages 259-264 https://doi.org/10.1016/j.measurement.2014.02.013 Assessment of along-normal uncertainties for application to terrestrial laser scanning surveys of engineering structures Tarvo Mill, Artu Ellmann Survey Review (2017), online first http://dx.doi.org/10.1080/00396265.2017.1361565 Future studies should investigate more closely how results depend on the different TLS signal processing methods, as well as the applicability of combined standard uncertainty (CSU; Bjerhammar 1973; Niemeier and Tengen 2017) equations that also account for systematic error in TLS surveys. The application of the proposed methodology to compare two point clouds of a beam bridge measured with two different scanner systems showed significant differences in parts of the beam. This is important in inspection work, since different conclusions could be reached depending on the measuring instrument.
  • 46. PDE-based Point cloud processing Partial Difference Operators on Weighted Graphs for Image Processing on Surfaces and Point Clouds François Lozes ; Abderrahim Elmoataz ; Olivier Lézoray IEEE Transactions on Image Processing ( Volume: 23, Issue: 9, Sept. 2014 ) https://doi.org/10.1109/TIP.2014.2336548 PDE-Based Graph Signal Processing for 3-D Color Point Clouds : Opportunities for cultural heritage François Lozes ; Abderrahim Elmoataz ; Olivier Lézoray IEEE Signal Processing Magazine ( Volume: 32, Issue: 4, July 2015 ) https://doi.org/10.1109/MSP.2015.2408631 The approach allows processing of signal data on point clouds (e.g., spectral data, colors, coordinates, and curvatures). We have applied this approach for cultural heritage purposes on examples aimed at restoration, denoising, hole-filling, inpainting, object extraction, and object colorization.
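The p-Laplacian framework on weighted graphs specializes, for p = 2, to plain heat diffusion on a k-NN graph, which is enough to see how a PDE runs on an unordered point cloud. A minimal NumPy/SciPy sketch for denoising a per-point color signal (graph weights, step size, and iteration count are illustrative choices, not from the papers):

```python
import numpy as np
from scipy.spatial import cKDTree

def graph_diffuse(points, signal, k=8, sigma=0.05, steps=20, tau=0.2):
    """Explicit heat-diffusion steps of a signal living on a point cloud.

    Builds a Gaussian-weighted k-NN graph, then iterates
    f <- f + tau * (weighted neighbor average - f), a discrete version
    of the (p = 2) graph-Laplacian PDE df/dt = -L f.
    """
    tree = cKDTree(points)
    dist, idx = tree.query(points, k=k + 1)
    dist, idx = dist[:, 1:], idx[:, 1:]            # drop the self-match
    w = np.exp(-(dist / sigma) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    f = signal.astype(float).copy()
    for _ in range(steps):
        f += tau * ((w[..., None] * f[idx]).sum(axis=1) - f)
    return f

pts = np.random.rand(2000, 3)
noisy = np.tile(pts[:, :1] > 0.5, (1, 3)) + 0.3 * np.random.randn(2000, 3)
print(np.std(graph_diffuse(pts, noisy) - noisy))   # nonzero: smoothing happened
```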
  • 47. Sparse coding and point clouds #1 Cloud Dictionary: Sparse Coding and Modeling for Point Clouds Or Litany, Tal Remez, Alex Bronstein (Submitted on 15 Dec 2016 (v1), last revised 20 Mar 2017 (this version, v2)) https://arxiv.org/abs/1612.04956 Sparse Geometric Representation Through Local Shape Probing Julie Digne, Sébastien Valette, Raphaëlle Chaine (Submitted on 7 Dec 2016) https://arxiv.org/abs/1612.02261 With the development of range sensors such as LIDAR and time-of-flight cameras, 3D point cloud scans have become ubiquitous in computer vision applications, the most prominent ones being gesture recognition and autonomous driving. Parsimony-based algorithms have shown great success on images and videos, where data points are sampled on a regular Cartesian grid. We propose an adaptation of these techniques to irregularly sampled signals by using continuous dictionaries. We present an example application in the form of point cloud denoising.
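On grid-sampled data, the parsimony prior referred to above amounts to coding each noisy patch with a few dictionary atoms and reconstructing from the sparse code; the papers' contribution is extending this to irregular samples via continuous dictionaries and local shape probing. The scikit-learn sketch below shows the grid-based baseline on a 1D height profile (patch size, atom count, and sparsity level are arbitrary choices):

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# Noisy 1D height profile, cut into overlapping patches (one per row).
rng = np.random.default_rng(0)
x = np.linspace(0, 4 * np.pi, 1024)
clean = np.sin(x) + 0.5 * np.sin(3 * x)
noisy = clean + 0.2 * rng.standard_normal(x.size)
P = 16
patches = np.stack([noisy[i:i + P] for i in range(x.size - P)])

# Learn a dictionary and sparse-code each patch (OMP, 3 atoms per patch).
dico = MiniBatchDictionaryLearning(n_components=32, transform_algorithm="omp",
                                   transform_n_nonzero_coefs=3, random_state=0)
codes = dico.fit(patches).transform(patches)
recon_patches = codes @ dico.components_

# Average overlapping patch reconstructions back into one signal.
denoised = np.zeros_like(noisy); counts = np.zeros_like(noisy)
for i, rp in enumerate(recon_patches):
    denoised[i:i + P] += rp; counts[i:i + P] += 1
denoised /= counts
print(np.abs(noisy - clean).mean(), np.abs(denoised - clean).mean())
```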
  • 48. Building Information models (BIM) and point clouds An IFC schema extension and binary serialization format to efficiently integrate point cloud data into building models Thomas Krijnen, Jakob Beetz Advanced Engineering Informatics Available online 3 April 2017 https://doi.org/10.1016/j.aei.2017.03.008

Building elements can be represented by various forms of geometry, including 2D and 3D line drawings, Constructive Solid Geometry (CSG), Boundary Representations (BRep) and tessellated meshes. However, these three-dimensional representations are just one of the many aspects conveyed in an IFC model. In addition, attributes related to thermal or acoustic performance, costing or intended use of spaces etc. can be added.

In many common data formats for the storage of point cloud data, such as E57 and PCD, metadata is attached to individual data sets. This metadata includes, for example, scanner positions or weather conditions observed during the scan. From the acquisition process, the point data itself contains no grouping, decomposition or other information that relates the points to the semantic meaning of the real-world object that was scanned. In subsequent processing steps such labels are often added to the points. Several exchange formats, such as LAS, have options to store labels along with the points.

The magnitude of the data typically found in point cloud data sets and IFC model populations can be dramatically different. A meaningful IFC file can have a file size on the order of a few megabytes, if geometrical representations and property values are properly reused and especially when the file contains implicit, parametric, rather than tessellated geometry. Depending on the amount of detail and precision, point cloud scans can easily amount to gigabytes of data. Despite the larger size, due to the uniform structure and explicit nature, point clouds can typically be explored more immediately than IFC building models, for which the boolean operations and implicit geometries need to be evaluated prior to visualization. The need for a unified and harmonized storage model for the two data types is observed in the literature [e.g. Li et al. 2008; Golparvar-Fard et al. 2011]. Yet, the authors acknowledge that other use cases will exist in which a deep coupling between building models and point clouds is unnecessary or even undesirable. This paper presents an extension to the IFC schema by which an open and semantically rich standard arises.

Future: One of the core advantages of the HDF5 format is the usage of transparent block-level compression. HDF5 allows several compression schemes, including user-defined compression methods. These would allow much higher compression ratios by exploiting structural knowledge of the point cloud or by introducing additional lossiness in the compression methods. In the prototypical implementation only gzip compression is used. Especially the point cloud segments stored as height maps projected on parametric surfaces might be suitable for special-purpose compression methods, such as jpeg or png, which can exploit and filter imperceptible differences. Lastly, future research will indicate how the associated point cloud structure presented in this paper can be paired with other spatial indexing structures to further advance the localized extraction of point cloud segments and spatial querying techniques.
Further experiments will be conducted to harness and reuse the general purpose decomposition and aggregation relationships of the IFC to implement octrees and kd-trees to further enhance the structure and accessibility of the data.
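The HDF5 mechanism the authors rely on is easy to demonstrate: datasets are chunked and compressed transparently at the block level, so a consumer reads the points with no decompression code of its own. A minimal sketch with h5py (file and dataset names are illustrative):

```python
import numpy as np
import h5py

pts = np.random.rand(1_000_000, 3).astype(np.float32)
colors = (np.random.rand(1_000_000, 3) * 255).astype(np.uint8)

with h5py.File("scan.h5", "w") as f:
    # Chunked, gzip-compressed storage, as in the paper's prototype;
    # HDF5 also accepts user-defined filters for specialized codecs.
    f.create_dataset("points", data=pts, chunks=(65536, 3),
                     compression="gzip", compression_opts=6)
    f.create_dataset("colors", data=colors, chunks=(65536, 3),
                     compression="gzip", compression_opts=6)
    f["points"].attrs["crs"] = "local"     # metadata rides along

with h5py.File("scan.h5", "r") as f:
    head = f["points"][:1000]              # partial reads decompress per chunk
    print(head.shape, f["points"].compression)
```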
  • 49. Dynamic Surface Mesh Detail enhancement #1 Multi-scale geometric detail enhancement for time-varying surfaces Graphical Models Volume 76, Issue 5, September 2014, Pages 413-425 https://doi.org/10.1016/j.gmod.2014.03.010 We first develop an adaptive spatio-temporal bilateral filter, which produces a temporally-coherent and feature-preserving multi-scale representation for the time-varying surfaces. We then extract the geometric details from the time-varying surfaces, and enhance them by exaggerating detail information at each scale across the time-varying surfaces. Velocity vector estimation: the top row gives 4 frames of the time-varying surfaces, and the bottom row gives the corresponding velocity vectors for each frame. Multi-scale representation and detail enhancement for time-varying surfaces. First row: input time-varying surfaces; second row: multi-scale filtering results obtained by filtering each frame individually; third row: multi-scale filtering results using the adaptive spatio-temporal filter; fourth and fifth rows: multi-scale detail enhancement results using 6 and 9 detail levels, respectively. Limitations: in our current detail transfer results, we only transfer the detail of a static model to time-varying surfaces. Our current algorithm cannot transfer the geometric detail of time-varying surfaces to target time-varying surfaces, which is challenging since it is difficult to build the corresponding mapping between source and target time-varying surfaces with different surface frames. Another problem is that although our filtering and enhancement methods can alleviate jittering artifacts, for input time-varying surfaces with heavy jittering the artifacts still cannot be removed completely. Processing surface sequences with heavy jittering is a very hard problem, which requires further investigation.
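The building block of the filter above is a bilateral weight over mesh vertices: a spatial term on neighbor distance and a range term on the offset along the normal, so creases survive smoothing; the paper additionally couples corresponding vertices across frames. A single-frame sketch in the Fleishman/Jones style, with illustrative parameters rather than the paper's exact operator:

```python
import numpy as np

def bilateral_vertex_filter(verts, normals, neighbors, sigma_s=0.1, sigma_r=0.02):
    """One bilateral pass over mesh vertex positions.

    Each vertex moves along its normal by a weighted average of its
    neighbors' normal-direction offsets h: spatial weight on distance,
    range weight on h, so sharp features are preserved.
    """
    out = verts.copy()
    for i, nbrs in enumerate(neighbors):
        d = verts[nbrs] - verts[i]
        h = d @ normals[i]                       # offset along the normal
        t = np.linalg.norm(d, axis=1)            # neighbor distance
        w = np.exp(-(t / sigma_s) ** 2) * np.exp(-(h / sigma_r) ** 2)
        out[i] = verts[i] + normals[i] * (w @ h) / (w.sum() + 1e-12)
    return out

# Toy usage: noisy plane with +z normals and 4-neighborhoods on a grid.
g = 20
xx, yy = np.meshgrid(np.linspace(0, 1, g), np.linspace(0, 1, g))
v = np.stack([xx.ravel(), yy.ravel(), 0.01 * np.random.randn(g * g)], axis=1)
n = np.tile([0.0, 0.0, 1.0], (g * g, 1))
nbrs = [np.array([i + o for o in (-1, 1, -g, g)
                  if 0 <= i + o < g * g
                  and not (o == -1 and i % g == 0)
                  and not (o == 1 and i % g == g - 1)])
        for i in range(g * g)]
print(v[:, 2].std(), bilateral_vertex_filter(v, n, nbrs)[:, 2].std())  # noise shrinks
```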
  • 50. Surface reconstruction Data Priors #1 Surface reconstruction with data-driven exemplar priors Oussama Remil, Qian Xie, Xingyu Xie, Kai Xu, Jun Wang Computer-Aided Design Volume 88, July 2017, Pages 31-41 https://doi.org/10.1016/j.cad.2017.04.004 Given a noisy and sparse point cloud of a structurally complex mechanical part as input, our system produces consolidated points by aligning exemplar priors learned from a mechanical shape database. With the additional information, such as normals, carried by our exemplar priors, our method achieves better feature preservation than direct reconstruction on the input point cloud (e.g., Poisson). An overview of our algorithm: we extract priors from a 3D shape database within the same category (e.g., mechanical parts) to construct a prior library. The affinity propagation clustering method is then performed on the prior library to obtain the set of representative priors, called the exemplar priors. Given an input point cloud, we construct its local neighborhoods and perform prior matching to find the exemplar prior most similar to each local neighborhood. Subsequently, we utilize the matched exemplar priors to consolidate the input point scan through an augmentation procedure, with which we can generate a faithful surface where sharp features and fine details are well recovered. Limitations: our method is expected to behave well with different shape categories, but a few limitations have to be discussed. Our algorithm fails when dealing with more challenging repositories with a small number of redundant elements, such as complex organic shapes. In addition, if there are large holes or big missing parts within the input scans, our method may fail to complete them, given its “matching-to-alignment” strategy.
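The exemplar selection step maps directly onto scikit-learn's affinity propagation, which picks representative samples (exemplars) without fixing the cluster count in advance. A sketch on toy patch descriptors (the descriptor construction is illustrative, not the paper's):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Toy 'prior library': feature vectors describing local neighborhoods,
# drawn from three latent patch types.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 32))
library = np.vstack([c + 0.1 * rng.normal(size=(50, 32)) for c in centers])

ap = AffinityPropagation(damping=0.9, random_state=0).fit(library)
exemplars = library[ap.cluster_centers_indices_]
print(len(exemplars), "exemplar priors selected from", len(library))

# Matching step: nearest exemplar for a query neighborhood descriptor.
query = centers[1] + 0.1 * rng.normal(size=32)
best = np.argmin(np.linalg.norm(exemplars - query, axis=1))
print("matched exemplar index:", best)
```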
  • 51. Surface reconstruction Data Priors #2A 3D Reconstruction Supported by Gaussian Process Latent Variable Model Shape Priors Jens Krenzin, Olaf Hellwich PFG – Journal of Photogrammetry, Remote Sensing and Geoinformation Science May 2017, Volume 85, Issue 2, pp 97–112 https://doi.org/10.1007/s41064-017-0009-0 (a) A 2D shape representing a filled circle, where black represents the outside of the object and white represents the inside. (b) Corresponding signed distance function (SDF) for the shape shown in (a); the 0-level is highlighted in red. (c) Discrete Cosine Transform (DCT) coefficients for the SDF shown in (b). The first 15 DCT coefficients in each dimension store the important information about the shape; the remaining coefficients are nearly zero.
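The SDF-plus-DCT shape code in the figure can be reproduced in a few lines: transform the SDF, keep the low-frequency 15 × 15 block, and invert. A SciPy sketch (the grid size is an arbitrary choice):

```python
import numpy as np
from scipy.fft import dctn, idctn

# SDF of a filled circle on a 64x64 grid (negative inside, positive outside).
g = np.linspace(-1, 1, 64)
xx, yy = np.meshgrid(g, g)
sdf = np.sqrt(xx**2 + yy**2) - 0.5

# Keep only the first 15x15 DCT coefficients, as in the paper's figure.
coeffs = dctn(sdf, norm="ortho")
kept = np.zeros_like(coeffs)
kept[:15, :15] = coeffs[:15, :15]
sdf_rec = idctn(kept, norm="ortho")

# The 0-level set (the shape boundary) survives the truncation well;
# ringing away from the low frequencies is the artefact discussed below.
print("max abs error:", np.abs(sdf - sdf_rec).max())
inside, inside_rec = sdf < 0, sdf_rec < 0
print("shape IoU:", (inside & inside_rec).sum() / (inside | inside_rec).sum())
```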
  • 52. Surface reconstruction Data Priors #2B 3D Reconstruction Supported by Gaussian Process Latent Variable Model Shape Priors Jens Krenzin, Olaf Hellwich PFG – Journal of Photogrammetry, Remote Sensing and Geoinformation Science May 2017, Volume 85, Issue 2, pp 97–112 https://doi.org/10.1007/s41064-017-0009-0 Results for object A—cup: (a) sample image, (b) erroneous point cloud, (c) ground truth, (d) shape prior, (e) corrected point cloud. This article presents a method that removes outliers, reduces noise and fills holes in a point cloud using a learned shape prior. The shape prior is learned from a set of training objects using the GP-LVM. It has been shown that a shape interpolated between several training shapes often has ringing artefacts due to the DCT compression step. Several investigations were made into how these artefacts could be reduced. In the first investigation, the difference between the training shapes was reduced and the latent space became denser. As expected, this reduced the Euclidean distance from one training example to the nearest training example. The closer two points are in the latent space, the more similar the corresponding shapes are. As a result the artefacts are reduced, but only slightly. In the second investigation, the DCT compression step was removed. The GP-LVM then learns a lower-dimensional subspace directly on the SDF. It has been shown that this also leads to a slight reduction of the artefacts of the reconstructed shape, but they remain visible. In this work the GP-LVM was investigated as a candidate fulfilling the requirements. It has been shown that the number of shape parameters can be reduced, and that the model can be trained for specific object classes. Some of the experiments, related to model sparsity and well-behavedness, have uncovered weaknesses of the presented method. These issues will be further investigated in future work.
  • 54. Depth MAP Inpainting #1 Kinect depth inpainting in real time Lucian Petrescu; Anca Morar; Florica Moldoveanu; Alin Moldoveanu Telecommunications and Signal Processing (TSP), 2016 https://doi.org/10.1109/TSP.2016.7760974 Example of output from the median filter: (A) input depth map where black pixels are not sampled; (B) output image after applying the median filter; (C) difference between input and output: grayscale – sampled pixel, blue – inpainted; (D) confidence: blue – filtered, white – sampled, red – unfiltered.
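A median-based depth fill of this flavor is short to write: take the median over each hole pixel's valid neighbors, write it back only where depth was unsampled, and iterate until the holes close. A sketch with SciPy (not the paper's real-time variant):

```python
import numpy as np
from scipy.ndimage import generic_filter

def median_inpaint(depth, max_iters=10):
    """Fill zero-valued (unsampled) depth pixels with the local median.

    The median is taken over valid neighbors only; sampled pixels are
    never modified, matching the 'blue = inpainted' convention above.
    """
    def valid_median(window):
        vals = window[window > 0]
        return np.median(vals) if vals.size else 0.0

    out = depth.astype(float).copy()
    for _ in range(max_iters):
        holes = out == 0
        if not holes.any():
            break
        filled = generic_filter(out, valid_median, size=3)
        out[holes] = filled[holes]
    return out

d = np.full((64, 64), 2.0); d[20:30, 20:30] = 0.0   # a square hole at 2 m
print((median_inpaint(d) == 0).sum(), "hole pixels remain")
```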
  • 55. Depth MAP Inpainting #2 A new method for inpainting of depth maps from time-of-flight sensors based on a modified closing by reconstruction algorithm Journal of Visual Communication and Image Representation Volume 47, August 2017, Pages 36-47 https://doi.org/10.1016/j.jvcir.2017.05.003 This procedure uses a modified morphological closing by reconstruction algorithm. The proposed method works properly on depth maps with a sufficiently good definition of regions, or at least enough to infer the missing information, e.g., depth maps obtained in indoor scenarios or acquired with sensors or methods that achieve these characteristics. Low-quality depth maps and those acquired in outdoor conditions may require additional pre-processing stages or even more robust methods, because the holes in such images tend to be larger. Filling Kinect depth holes via position-guided matrix completion Zhongyuan Wang, Xiaowei Song, ShiZheng Wang, Jing Xiao, Rui Zhong, Ruimin Hu Neurocomputing Volume 215, 26 November 2016, Pages 48-52 https://doi.org/10.1016/j.neucom.2015.05.146
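Grayscale morphological reconstruction, the primitive behind closing by reconstruction, ships with scikit-image; the standard border-seeded fill below conveys the idea of flooding enclosed depth holes from their surroundings, though it is not the paper's modified algorithm:

```python
import numpy as np
from skimage.morphology import reconstruction

def fill_depth_holes(depth):
    """Grayscale hole filling via reconstruction by erosion.

    Seed = image maximum everywhere except the border; eroding the seed
    while staying >= the mask (the depth map) floods enclosed minima,
    i.e. the holes, from their surroundings.
    """
    seed = np.copy(depth)
    seed[1:-1, 1:-1] = depth.max()
    return reconstruction(seed, depth, method="erosion")

d = np.full((64, 64), 2.0)
d[10:20, 10:20] = 0.0                 # ToF dropout region
filled = fill_depth_holes(d)
print(filled[15, 15])                 # the hole now carries ~2.0
```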
  • 56. Depth MAP Inpainting #3 Learning-based super-resolution with applications to intensity and depth images Haoheng Zheng, University of Wollongong, Doctor of Philosophy thesis, School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, 2014. http://ro.uow.edu.au/theses/4284 Geometric Inpainting of 3D Structures Pratyush Sahay, A. N. Rajagopalan The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2015, pp. 1-7 https://doi.org/10.1109/CVPRW.2015.7301388 Missing regions are plausibly completed by the proposed framework, albeit with occasional minor local artifacts. “Low-rank Theory” [184] Candès et al. (2011) “Robust principal component analysis?” Journal of the ACM (JACM) Volume 58 Issue 3, May 2011 https://doi.org/10.1145/1970392.1970395
  • 57. Depth MAP super-resolution #1 Depth map super resolution Murat Gevrekci; Kubilay Pakin Image Processing (ICIP), 2011 18th IEEE https://doi.org/10.1109/ICIP.2011.6116454 Depth map acquisition with a ToF camera at different integration times. The image on the left is captured with a 10 ms integration time; note how the background depth information is noisy. The image on the right is captured with a 50 ms integration time; background depth is captured reliably at the expense of saturating the near field. We propose imposing our constraint sets not on a single range image but on differently exposed range images, to increase depth resolution across the whole working space. This concept resembles High Dynamic Range (HDR) [Gevrekci and Gunturk 2007] image formation from differently exposed images. The proposed algorithm will merge useful depth information from different levels and eliminate contaminated data (i.e. saturation, noise). Modeling the imaging pipeline is a critical step in image enhancement, as demonstrated by the author [Gevrekci and Gunturk 2005]. We propose modeling the depth map as a function of internal camera parameters, object and camera motion, and photometric changes due to the camera response function and alternating integration time. Spatially Adaptive Tensor Total Variation-Tikhonov Model for Depth Image Super Resolution Gang Zhong; Sen Xiang; Peng Zhou; Li Yu IEEE Access (Volume: 5, 2017) https://doi.org/10.1109/ACCESS.2017.2715981 Visual comparison of 4× super resolution results on our synthetic scene Chess: (a) the ground-truth depth image; super resolution results (b) using Tikhonov regularization, (c) using total variation regularization, (d) using color-guided tensor total variation regularization, (e) using fused-edge-map-guided tensor total variation regularization, (f) using the spatially adaptive tensor total variation–Tikhonov regularization.
  • 58. Depth MAP super-resolution #2 Image-guided ToF depth upsampling: a survey Iván Eichhardt, Dmitry Chetverikov, Zsolt Jankó Machine Vision and Applications May 2017, Volume 28, Issue 3–4, pp 267–282 https://doi.org/10.1007/s00138-017-0831-9 Effect of imprecise calibration on depth upsampling. The discrepancy between the input depth and colour images is 2, 5 and 10 pixels, respectively Effect of optical radial distortion on depth upsampling
  • 59. Depth Map super-resolution #3 Super-resolution Reconstruction for Binocular 3D Data Wei-Tsung Hsiao; Jing-Jang Leou; Han-Hui Hsiao Pattern Recognition (ICPR), 2014 https://doi.org/10.1109/ICPR.2014.721 Depth Superresolution using Motion Adaptive Regularization Ulugbek S. Kamilov, Petros T. Boufounos (Submitted on 4 Mar 2016) https://arxiv.org/abs/1603.01633 Our motion adaptive method recovers a high-resolution depth sequence from high-resolution intensity and low-resolution depth sequences by imposing rank constraints on the depth patches: (a) and (b) t-y slices of the color and depth sequences, respectively, at a fixed x; (c)–(e) x-y slices at t1 = 10; (f)–(h) x-y slices at t2 = 40; (c) and (f) input color images; (d) and (g) input low-resolution and noisy depth images; (e) and (h) estimated depth images. Illustration of the block matching within a space-time search area. The area in the current frame t is centered at the reference patch. The search is also conducted in the same window position in multiple temporally adjacent frames. Similar patches are grouped together to construct a block β_p = B_p φ. Visual evaluation on the Road video sequence. Estimation of depth from its 3× downsized version at 30 dB input SNR. Row 1 shows the data at time instance t = 9. Row 2 shows the data at time instance t = 47. Row 3 shows the t-y profile of the data at x = 64. Highlights indicate some of the areas where the depth estimated by GDS-3D recovers details missing in the depth estimate of DS-3D, which does not use intensity information.
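The space-time block matching illustrated above is straightforward to prototype: compare the reference patch against candidates in a window at the same position across adjacent frames and keep the closest ones, which form the block the rank constraint is imposed on. An illustrative NumPy version (patch, window, and group sizes are arbitrary):

```python
import numpy as np

def match_blocks(frames, t, y, x, P=8, W=10, T=2, n_keep=8):
    """Group patches similar to the reference patch frames[t, y:y+P, x:x+P].

    Searches a (2W+1)^2 spatial window at the same position in frames
    t-T..t+T and returns the n_keep best (time, y, x) offsets by SSD.
    """
    ref = frames[t, y:y + P, x:x + P]
    cands = []
    for tt in range(max(0, t - T), min(len(frames), t + T + 1)):
        for dy in range(-W, W + 1):
            for dx in range(-W, W + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy <= frames.shape[1] - P and 0 <= xx <= frames.shape[2] - P:
                    ssd = np.sum((frames[tt, yy:yy + P, xx:xx + P] - ref) ** 2)
                    cands.append((ssd, tt, yy, xx))
    cands.sort(key=lambda c: c[0])
    return cands[:n_keep]

vid = np.random.rand(5, 64, 64)
group = match_blocks(vid, t=2, y=20, x=20)
print([(t, y, x) for _, t, y, x in group][:3])  # best match is the patch itself
```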
  • 60. Depth Map super-resolution #4 Depth Map Restoration From Undersampled Data Srimanta Mandal; Arnav Bhavsar; Anil Kumar Sao IEEE Transactions on Image Processing (Volume: 26, Issue: 1, Jan. 2017) https://doi.org/10.1109/TIP.2016.2621410 The objective of the paper: (a) uniform up-sampling of an LR depth map, i.e., filling in missing information in an HR grid generated from a uniformly sampled LR depth map – can be addressed by SR; (b) non-uniform up-sampling of a sparse point cloud, i.e., filling in the missing information in a randomly filled HR grid – can be addressed by PCC; (c) an extreme case of non-uniform up-sampling, where very little data is available. We suggest an approach wherein this is interpreted as non-uniform up-sampling followed by uniform up-sampling – can be addressed by PCC-SR. We have addressed the problem of depth restoration by up-sampling either a uniformly sampled LR depth map or a sparse, non-uniformly sampled point cloud in a unified sparse representation framework.
  • 61. Depth MAP Joint Superresolution-Inpainting #1 Range map superresolution-inpainting, and reconstruction from sparse data Computer Vision and Image Understanding Volume 116, Issue 4, April 2012, Pages 572-591 https://doi.org/10.1016/j.cviu.2011.12.005 Depth map inpainting and super-resolution based on internal statistics of geometry and appearance Satoshi Ikehata; Ji-Ho Cho; Kiyoharu Aizawa Image Processing (ICIP), 2013 20th IEEE https://doi.org/10.1109/ICIP.2013.6738194 In this paper, we have proposed depth-map inpainting and super-resolution algorithms which explicitly capture the internal statistics of a depth map and its registered texture image, and have demonstrated their state-of-the-art performance. The current limitation is that we have assumed accurate registration of the texture image and have not assumed the presence of sensor noise. In future work, we will evaluate our method's robustness to these problems to assess its handling of more practical situations. Range image expansion and inpainting: (a and d) LR images with missing data for the apple and birdhouse datasets; (b and e) interpolated images with missing data; (c and f) range expansion with inpainting using the proposed method; (g and h) and (i and j) 3D reconstructions with light-rendering and gray-scale representation, respectively, for apple and birdhouse. Range expansion with inpainting across different objects: (a) interpolated range observation; (b) corresponding HR and inpainted range output using the proposed method; (c–e) unlinked, linked and residual edge maps, respectively, which are used to restrict smoothness across edges. Effect of noise on edge-linking: (a) noisy observation; (e) corresponding HR and inpainted output; (b–d) unlinked, linked and residual edges when no noise is added to the observation; (f–h) unlinked, linked and residual edges for the observation in (a).
  • 62. Depth MAP Joint Superresolution-Inpainting #2 Superpixel-based depth map enhancement and hole filling for view interpolation Proceedings Volume 10420, Ninth International Conference on Digital Image Processing (ICDIP 2017); 104202O (2017) http://dx.doi.org/10.1117/12.2281544 Depth enhancement with improved exemplar- based inpainting and joint trilateral guided filtering Liang Zhang ; Peiyi Shen ; Shu'e Zhang ; Juan Song ; Guangming Zhu Image Processing (ICIP), 2016 IEEE https://doi.org/10.1109/ICIP.2016.7533131 Superpixel-based initial depth map refinement: (a) superpixel segmentation of the color image, (b) initial depth map segmentation using the same superpixel label as (a), (c) initial depth map before refinement, (d) enhanced depth map of (c). Superpixel-based warped depth map hole filling: (a) and (b) are superpixels with hole regions, (c) and (d) are hole filling results of (a) and (b), respectively. In this paper, we propose an efficient superpixel-based depth information processing method for view interpolation. First of all, the color image is segmented into superpixels using SLIC algorithm, and the associated initial depth map is segmented with the same label. After that, the depth-missing pixels are recovered by considering the color and depth superpixels jointly. Furthermore, the holes caused by disocclusion in the warped depth map can also be filled in superpixel domain. Experimental results demonstrate that with the incorporation of the proposed initial depth map enhancement and warped depth map hole filling method, better view interpolation performances have been achieved.
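The superpixel-guided fill can be prototyped directly with scikit-image's SLIC: segment the color image, then replace missing depth inside each superpixel with the median of that superpixel's valid depths. A toy sketch (the paper's joint color-depth recovery and warped-map handling are more elaborate):

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_depth_fill(color, depth, n_segments=200):
    """Fill depth == 0 pixels with the median valid depth of their superpixel."""
    labels = slic(color, n_segments=n_segments, compactness=10.0)
    out = depth.astype(float).copy()
    for lab in np.unique(labels):
        mask = labels == lab
        valid = out[mask] > 0
        if valid.any():
            out[mask & (depth == 0)] = np.median(out[mask][valid])
    return out

# Toy scene: two color regions with distinct depths and scattered holes.
rng = np.random.default_rng(0)
color = np.zeros((64, 64, 3)); color[:, 32:] = [1.0, 0.0, 0.0]
truth = np.where(color[..., 0] > 0, 3.0, 1.0)
depth = truth.copy()
depth[rng.random((64, 64)) < 0.2] = 0.0        # 20% dropout
filled = superpixel_depth_fill(color, depth)
# Residual error concentrates at superpixels straddling the region border.
print(np.abs(filled - truth).mean())
```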
  • 63. Future Deep learning for 2D-based IMAGE restoration Inspiration for non-Euclidean 3D images?
  • 64. Image restoration Loss functions & Quality metrics #1A Loss Functions for Image Restoration With Neural Networks Hang Zhao; Orazio Gallo; Iuri Frosio; Jan Kautz NVIDIA, MIT Media Lab IEEE Transactions on Computational Imaging (Volume: 3, Issue: 1, March 2017) https://doi.org/10.1109/TCI.2016.2644865

The loss layer, despite being the effective driver of the network's learning, has attracted little attention within the image processing research community: the choice of the cost function generally defaults to the squared l2 norm of the error [Jain et al. 2009; Burger et al. 2012; Dong et al. 2014; Wang 2014]. This is understandable, given the many desirable properties this norm possesses. There is also a less well-founded, but just as relevant, reason for the continued popularity of l2: standard neural network packages, such as Caffe, only offer the implementation for this metric. However, l2 suffers from well-known limitations. For instance, when the task at hand involves image quality, l2 correlates poorly with image quality as perceived by a human observer [Zhang et al. 2012]. This is because of a number of assumptions implicitly made when using l2. First and foremost, the use of l2 assumes that the impact of noise is independent of the local characteristics of the image. On the contrary, the sensitivity of the Human Visual System (HVS) to noise depends on local luminance, contrast, and structure [Wang et al. 2004]. The l2 loss also works under the assumption of white Gaussian noise, which is not valid in general [e.g. Wang and Bovik 2009].

We focus on the use of neural networks for image restoration tasks, and we study the effect of different metrics for the network's loss layer. We compare l2 against four error metrics on representative tasks: image super-resolution, JPEG artifact removal, and joint denoising plus demosaicking. First, we test whether a different local metric such as l1 can produce better results. We then evaluate the impact of perceptually motivated metrics. We use two state-of-the-art metrics for image quality: the structural similarity index (SSIM [Wang et al. 2004]) and the multiscale structural similarity index (MS-SSIM [Wang et al. 2003]). We choose these among the plethora of existing indexes because they are established measures, and because they are differentiable, a requirement for the backpropagation stage. As expected, on the use cases we consider, the perceptual metrics outperform l2. However, and perhaps surprisingly, this is also true for l1; see Figure 1. Inspired by this observation, we propose a novel loss function and show its superior performance in terms of all the metrics we consider.
  • 65. Image restoration Loss functions & Quality metrics #1B However, it is widely accepted that l2, and consequently the Peak Signal-to-Noise Ratio (PSNR), do not correlate well with human perception of image quality: l2 simply does not capture the intricate characteristics of the human visual system (HVS).

There exists a rich literature of error measures, both reference-based and non-reference-based, that attempt to address the limitations of the simple l2 error function. For our purposes, we focus on reference-based measures. A popular reference-based index is the structural similarity index (SSIM). SSIM evaluates images accounting for the fact that the HVS is sensitive to changes in local structure. Wang et al. 2003 extend SSIM, observing that the scale at which local structure should be analyzed is a function of factors such as image-to-observer distance. To account for these factors, they propose MS-SSIM, a multi-scale version of SSIM that weighs SSIM computed at different scales according to the sensitivity of the HVS. Experimental results have shown the superiority of SSIM-based indexes over l2. As a consequence, SSIM has been widely employed as a metric to evaluate image processing algorithms. Moreover, given that it can be used as a differentiable cost function, SSIM has also been used in iterative algorithms designed for image compression [Wang and Bovik 2009], image reconstruction [Brunet et al. 2010], denoising and super-resolution [Rehman et al. 2012], and even downscaling [Öztireli and Gross 2015]. To the best of our knowledge, however, SSIM-based indexes have never been adopted to train neural networks.

Recently, novel image quality indexes based on the properties of the HVS showed improved performance when compared to SSIM and MS-SSIM. One of these is the Information Weighted SSIM (IW-SSIM), a modification of MS-SSIM that also includes a weighting scheme proportional to the local image information [Wang and Li 2011]. Another is the Visual Information Fidelity (VIF), which is based on the amount of shared information between the reference and distorted image [Sheikh and Bovik 2006]. The Gradient Magnitude Similarity Deviation (GMSD) is characterized by simplified math and performance similar to that of SSIM, but it requires computing the standard deviation over the whole image [Xue et al. 2014]. Finally, the Feature Similarity Index (FSIM) leverages the perceptual importance of phase congruency, and measures the dissimilarity between two images based on local phase congruency and gradient magnitude [Zhang et al. 2011]. FSIM has also been extended to FSIMc, which can be used with color images. Despite the improved accuracy in terms of image quality, the mathematical formulation of these indexes is generally more complex than that of SSIM and MS-SSIM, and possibly not differentiable, making their adoption in optimization procedures less immediate.
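The argument translates into a few lines of loss code: SSIM from Gaussian-weighted local statistics, blended with l1. The NumPy sketch below evaluates such a mixed loss; the paper's proposed loss combines MS-SSIM with l1 (the 0.84 weight follows its reported mix), so the single-scale SSIM here is a simplification:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def ssim_map(x, y, sigma=1.5, L=1.0):
    """Per-pixel SSIM from Gaussian-weighted local statistics (Wang et al. 2004)."""
    C1, C2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mx, my = gaussian_filter(x, sigma), gaussian_filter(y, sigma)
    vx = gaussian_filter(x * x, sigma) - mx * mx
    vy = gaussian_filter(y * y, sigma) - my * my
    cxy = gaussian_filter(x * y, sigma) - mx * my
    return ((2 * mx * my + C1) * (2 * cxy + C2)
            / ((mx * mx + my * my + C1) * (vx + vy + C2)))

def mixed_loss(pred, target, alpha=0.84):
    """alpha * (1 - SSIM) + (1 - alpha) * l1, cf. the paper's MS-SSIM + l1 mix."""
    return (alpha * (1.0 - ssim_map(pred, target).mean())
            + (1 - alpha) * np.abs(pred - target).mean())

gt = np.clip(np.random.rand(64, 64), 0, 1)
print(mixed_loss(gt + 0.05 * np.random.randn(64, 64), gt))  # positive loss
print(mixed_loss(gt, gt))                                   # 0 for a perfect prediction
```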
  • 66. Point cloud transformations Numerical geometry of non-rigid shapes Michael Bronstein http://slideplayer.com/slide/4925779/ Left: Intrinsic vs. extrinsic properties of shapes. Top left: original shape. Top right: reconstructed shape from a geometry image, with cut edges displayed in red. The middle and bottom rows show the geometry images encoding the y coordinates and the HKS, respectively, of two spherical parameterizations (left and right). The two spherical parameterizations are symmetrically rotated by 180 degrees along the Y-axis. The geometry images for the Y-coordinate display an axial as well as an intensity flip, whereas the geometry images for the HKS only display an axial flip. This is because the HKS is an intrinsic shape signature (geodesics are preserved) whereas point coordinates on a shape surface are not. Center: Intrinsic descriptors (here the HKS) are invariant to shape articulations. Right: Padding structure of geometry images: the geometry images for the 3 coordinates are replicated to produce a 3×3 grid. The center image in each grid corresponds to the original geometry image. Observe that no discontinuities exist along the grid edges. Sinha et al. (2016) Left: Geometry images created by fixing the polar axis of a hand (top) and aeroplane (bottom), and rotating the spherical parametrization by equal intervals along the axis. The cut is highlighted in red. Center: Four rotated geometry images for a different cut location, highlighted in red. The plots to the right show padded geometry images wherein the similarity across rotated geometry images is more evident and the five finger features are coherently visible. Right: Changing the viewing direction for a cut inverts the geometry image. The similarity in geometry images for the two diametrically opposite cuts emerges when we pad the image in a 3×3 grid. Sinha et al. (2016) Authalic vs. conformal parametrization: (left to right) 2500 vertices of the hand mesh are color-coded in the first two plots. A 64×64 geometry image is created by uniformly sampling a parametrization and then interpolating the nearby feature values. The authalic geometry image encodes all tip features. Conformal parametrization compresses high-curvature points into dense regions [Gu et al. 2003]; hence, the finger tips are all mapped to very small regions. The fourth plot shows that the resolution of the geometry image is insufficient to capture the tip feature colors in the conformal parametrization. This is validated by reconstructing the shape from geometry images encoding x, y, z locations for both parameterizations in the final two plots.
  • 67. 2D super-resolution techniques for Geometry images MemNet: A Persistent Memory Network for Image Restoration Ying Tai, Jian Yang, Xiaoming Liu, Chunyan Xu (Submitted on 7 Aug 2017) https://arxiv.org/abs/1708.02209 https://github.com/tyshiwo/MemNet The same MemNet structure achieves the state-of-the-art performance in image denoising, super-resolution and JPEG deblocking. Due to the strong learning ability, our MemNet can be trained to handle different levels of corruption even using a single model. CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua (Submitted on 29 Mar 2017) https://arxiv.org/abs/1703.10155 https://github.com/tatsy/keras-generative The proposed method can support a wide variety of applications, including image generation, attribute morphing, image inpainting, and data augmentation for training better face recognition models
  • 68. Surfaces segmentation and correspondence Convolutional Neural Networks on Surfaces via Seamless Toric Covers Haggai Maron, Meirav Galun, Noam Aigerman, Miri Trope, Nadav Dym, Ersin Yumer, Vladimir G. Kim, Yaron Lipman | Weizmann Institute of Science, Adobe Research ACM Transactions on Graphics (TOG) Volume 36 Issue 4, July 2017 Article No. 71 http://dx.doi.org/10.1145/3072959.3073616 Parameterization produced by the geometry image method of [Sinha et al. 2016]; the parameterization is not seamless, as the isolines break at the dashed image boundary (right); although the parameterization preserves area, it produces large variability in shape. Computing the flat-torus structure (middle) on a 4-cover of a sphere-type surface (left) defined by prescribing three points (colored disks). The right inset shows the flat torus resulting from a different triplet choice. Visualization of "easy" functions on the surface (top row) and their pushed versions on the flat torus (bottom row). We show three examples of functions we use as input to the network: (a) average geodesic distance (left), (b) the x component of the surface normal (middle), and (c) the Wave Kernel Signature [Aubry et al. 2011]. The blowup shows the face area, illustrating that the input functions capture relevant information in the shape. Experiments show that our method is able to learn and generalize semantic functions better than state-of-the-art geometric learning approaches in segmentation tasks. Furthermore, it can use only basic local data (Euclidean coordinates, curvature, normals) to achieve a high success rate, demonstrating the ability to learn high-level features from a low-level signal. This is the key advantage of defining a local translation-invariant convolution operator. Finally, it is easy to implement and is fully compatible with current standard CNN implementations for images. A limitation of our technique is that it assumes the input shape is a mesh with a sphere-like topology. An interesting direction for future work is extending our method to meshes with arbitrary topologies. This problem is especially interesting since in certain cases shapes from the same semantic class may have different genus. Another limitation is that aggregation is currently done as a separate post-processing step and not as part of the CNN optimization. An interesting future work in this regard is to incorporate the aggregation in the learning stage and produce an end-to-end learning framework.
  • 70. Point clouds Classification and segmentation PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas | Stanford University (Submitted on 7 Jun 2017) https://arxiv.org/abs/1706.02413 https://github.com/charlesq34/pointnet2 (TensorFlow) Shapes in SHREC15 are 2D surfaces embedded in 3D space. Geodesic distances along the surfaces naturally induce a metric space. We show through experiments that adopting PointNet++ in this metric space is an effective way to capture the intrinsic structure of the underlying point set. We follow Rustamov et al. (2009) to obtain an embedding metric that mimics geodesic distance. Next we extract intrinsic point features in this metric space, including the Wave Kernel Signature (WKS) [Aubry et al. 2011], the Heat Kernel Signature (HKS) [Sun et al. 2009] and multi-scale Gaussian curvature [Meyer et al. 2003]. We use these features as input and then sample and group points according to the underlying metric space. In this way, our network learns to capture multi-scale intrinsic structure that is not influenced by the specific pose of a shape. Alternative design choices include using XYZ coordinates as point features or using Euclidean space R^3 as the underlying metric space. We show below that these are not optimal choices.
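The sampling layer at the core of PointNet++'s set abstraction is farthest point sampling; in the intrinsic variant above, the Euclidean distance would simply be replaced by the embedding metric. A NumPy sketch of FPS:

```python
import numpy as np

def farthest_point_sampling(points, n_samples, seed=0):
    """Pick n_samples indices so each new pick maximizes its distance
    to the set already chosen: PointNet++'s centroid selection step."""
    rng = np.random.default_rng(seed)
    chosen = np.empty(n_samples, dtype=int)
    chosen[0] = rng.integers(len(points))
    d = np.linalg.norm(points - points[chosen[0]], axis=1)
    for i in range(1, n_samples):
        chosen[i] = np.argmax(d)              # farthest from all picks so far
        d = np.minimum(d, np.linalg.norm(points - points[chosen[i]], axis=1))
    return chosen

pts = np.random.rand(4096, 3)
centroids = pts[farthest_point_sampling(pts, 128)]
print(centroids.shape)    # (128, 3): well-spread group centers
```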
  • 71. Point clouds Novel descriptors Learning Compact Geometric Features Marc Khoury, Qian-Yi Zhou, Vladlen Koltun (Submitted on 15 Sep 2017) https://arxiv.org/abs/1709.05056 We present an approach to learning features that represent the local geometry around a point in an unstructured point cloud. Such features play a central role in geometric registration, which supports diverse applications in robotics and 3D vision. The presented approach yields a family of features, parameterized by dimension, that are both more compact and more accurate than existing descriptors. Background: the development of geometric descriptors for rigid alignment of unstructured point clouds dates back to the 90s. Classic descriptors include Spin Images [Johnson and Hebert 1999] and 3D Shape Context [Frome et al. 2004]. More recent work introduced Point Feature Histograms (PFH) [Rusu et al. 2008], Fast Point Feature Histograms (FPFH) [Rusu et al. 2009], Signatures of Histograms of Orientations (SHOT) [Salti et al. 2014], and Unique Shape Contexts (USC) [Tombari et al. 2010]. A comprehensive evaluation of existing local geometric descriptors is reported by Guo et al. 2016. The learned descriptor is both more precise and more compact than handcrafted features. Due to its Euclidean structure, the learned descriptor can be used as a drop-in replacement for existing features in robotics, 3D vision, and computer graphics applications. We expect future work to further improve precision, compactness, and robustness, possibly using new approaches to optimizing feature embeddings [Ustinova and Lempitsky 2016, https://github.com/madkn/HistogramLoss, https://youtu.be/_N1qYrv321E].
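In a registration pipeline such descriptors feed a nearest-neighbor matching stage, which is where compactness pays off. A sketch of mutual-nearest-neighbor matching with SciPy (the random vectors stand in for learned CGF descriptors):

```python
import numpy as np
from scipy.spatial import cKDTree

def match_descriptors(desc_a, desc_b):
    """Mutual nearest-neighbor correspondences between two descriptor sets.

    Keeps a pair (i, j) only if j is i's nearest neighbor in B *and*
    i is j's nearest neighbor in A: the usual precision filter before
    RANSAC-style rigid alignment.
    """
    ab = cKDTree(desc_b).query(desc_a)[1]     # A -> B nearest neighbors
    ba = cKDTree(desc_a).query(desc_b)[1]     # B -> A nearest neighbors
    return [(i, j) for i, j in enumerate(ab) if ba[j] == i]

# Stand-in 32-D descriptors; desc_b simulates the same points rescanned.
rng = np.random.default_rng(0)
desc_a = rng.normal(size=(500, 32))
desc_b = desc_a + 0.05 * rng.normal(size=(500, 32))
pairs = match_descriptors(desc_a, desc_b)
print(len(pairs), "mutual matches;",
      np.mean([i == j for i, j in pairs]), "correct fraction")
```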
  • 72. Dense Grid Point clouds generative model Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction Chen-Hsuan Lin, Chen Kong, Simon Lucey (Submitted on 21 Jun 2017) https://arxiv.org/abs/1706.07036 We use 2D convolutional operations to predict the 3D structure from multiple viewpoints and jointly apply geometric reasoning with 2D projection optimization. We introduce the pseudo-renderer, a differentiable module that approximates the true rendering operation, to synthesize novel depth maps for optimization. Experimental results for single-image 3D object reconstruction tasks show that our method outperforms state-of-the-art methods in terms of shape similarity and prediction density. Network architecture: from an encoded latent representation, we propose to use a structure generator, based on 2D convolutional operations, to predict the 3D structure at N viewpoints. The point clouds are fused by transforming the 3D structure at each viewpoint to canonical coordinates. The pseudo-renderer synthesizes depth images from novel viewpoints, which are further used for joint 2D projection optimization. It contains no learnable parameters and reasons based purely on 3D geometry. Concept of pseudo-rendering: multiple transformed 3D points may project onto the same pixel in the image space. (a) Collisions could easily occur if the projections were directly discretized. (b) Upsampling the target image increases the precision of the projection locations and thus alleviates the collision effect. A max-pooling operation on the inverse depth values follows, to obtain the original resolution while maintaining the effective depth value at each pixel. (c) Examples of pseudo-rendered depth images with various upsampling factors U (only valid depth values without collision are shown). Pseudo-rendering achieves performance closer to true rendering with a higher value of U.
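The pseudo-renderer reduces to a z-buffer scatter: project the points into a U-times-upsampled grid, resolve collisions by keeping the largest inverse depth, then max-pool back to the target resolution. A NumPy sketch of the idea (orthographic projection for simplicity; the paper uses full camera transforms and N viewpoints):

```python
import numpy as np

def pseudo_render(points, res=32, U=4):
    """Approximate depth rendering of an (N, 3) cloud in [0, 1]^3.

    Scatters inverse depth into a (res*U)^2 buffer, with collisions
    resolved by keeping the max (the nearest point), then max-pools
    U x U cells back to res^2 -- the upsample + max-pool trick above.
    """
    hi = res * U
    px = np.clip((points[:, 0] * hi).astype(int), 0, hi - 1)
    py = np.clip((points[:, 1] * hi).astype(int), 0, hi - 1)
    inv = 1.0 / (points[:, 2] + 1e-6)
    buf = np.zeros((hi, hi))
    np.maximum.at(buf, (py, px), inv)        # nearest point wins the cell
    pooled = buf.reshape(res, U, res, U).max(axis=(1, 3))
    return np.where(pooled > 0, 1.0 / pooled, 0.0)

pts = np.random.rand(20000, 3)
d1, d4 = pseudo_render(pts, U=1), pseudo_render(pts, U=4)
# The finer buffer changes which point wins near cell borders.
print(np.abs(d1 - d4).mean())
```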
  • 73. Point clouds GAN #1A Representation Learning and Adversarial Generation of 3D Point Clouds Panos Achlioptas, Olga Diamanti, Ioannis Mitliagkas, Leonidas Guibas (same last author as for PointNet++) (Submitted on 8 Jul 2017) https://arxiv.org/abs/1707.02392 Editing parts in point clouds using vector arithmetic on the autoencoder (AE) latent space. Left to right: tuning the appearance of cars towards the shape of convertibles, adding armrests to chairs, removing the handle from a mug. We build an end-to-end pipeline for 3D point clouds that uses an AE to create a latent representation, and a GAN to generate new samples in that latent space. Our AE is designed with a structural loss tailored to unordered point clouds. Our learned latent space, while compact, has excellent class-discriminative ability: per our classification results, it outperforms recent GAN-based representations by 4.3%. In addition, the latent space allows for vector arithmetic, which we apply in a number of shape editing scenarios, such as interpolation and structural manipulation. We argue that jointly learning the representation and training the GAN is unnecessary for our modality. We propose a workflow that first learns a representation by training an AE with a compact bottleneck layer, then trains a plain GAN in that fixed latent representation. One benefit of this approach is that AEs are a mature technology: training them is much easier and they are compatible with more architectures than GANs. We point to theory [Arjovsky and Bottou 2017] that supports this idea, and verify it empirically: we show that GANs trained in our learned AE-based latent space generate visibly improved results, even with a generator and discriminator as shallow as a single hidden layer. Within a handful of epochs, we generate geometries that are recognized in their right object class at a rate close to that of ground truth data. Importantly, we report significantly better diversity measures (10x divergence reduction) over the state of the art, establishing that we cover more of the original data distribution. In summary, we contribute: ● an effective cross-category AE-based latent representation of point clouds; ● the first (monolithic) GAN architecture operating on 3D point clouds; ● a surprisingly simpler, state-of-the-art GAN working in the AE's latent space.
  • 74. Point clouds GAN #1B Raw point cloud GAN (r-GAN): the first version of our generative model operates directly on the raw 2048 × 3 point set input; the generator is driven by a 512-dimensional noise vector. Finally, training a GAN in the latent space is much faster and much more stable. The inset provides some intuition with a toy example, where the data live in a 1D circular manifold. The density in red is the result of training a GAN's generator in the original, 2D, data space. The most commonly used GAN objectives are equivalent to minimizing the Jensen-Shannon divergence (JSD) between the generator and data distributions. Unfortunately, the JSD is part of a family of divergences that become unbounded when there is support mismatch, which is the case in the example: the GAN places a lot of mass outside the data manifold. On the other hand, when training a small GAN in the fixed latent space of a trained AE (blue), the overlap of the two distributions increases significantly. According to recent theoretical advances [Arjovsky and Bottou 2017] this should improve stability. Latent-space GAN (l-GAN): in our latent-space GAN (here, l-GAN), instead of operating on the raw point cloud input, we pass the data through our pre-trained autoencoder, trained separately for each object class with the earth mover's distance (EMD) loss function. Both the generator and the discriminator of the GAN then operate on the 512-dimensional bottleneck variable of the AE. Finally, once the GAN training is over, the output of the generator is decoded to a point cloud via the AE decoder. The architecture of the l-GAN is significantly simpler than that of the r-GAN. We found that very shallow designs for both the generator and discriminator (in our case, 1 hidden layer for the generator and 2 for the discriminator) are sufficient to produce realistic results. An interesting avenue for future work involves further exploring the idea of ingesting point clouds by sorting them lexicographically before applying a 1D convolution. A possibly interesting extension would be to study different 1D orderings that capture locality differently, e.g. Hilbert space-filling curves. We can also aim for convolution operators of higher order (2D and 3D).
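The part-editing figure on the previous slide needs only the trained encoder/decoder plus vector arithmetic: average the latent codes of shapes with and without an attribute and add the difference to a query code. A schematic sketch in which `encode`/`decode` are hypothetical linear stand-ins for the paper's EMD autoencoder:

```python
import numpy as np

# Stand-ins for the trained autoencoder with its 512-D bottleneck.
rng = np.random.default_rng(0)
W = rng.normal(size=(512, 2048 * 3)) / 50.0
def encode(cloud): return W @ cloud.ravel()            # hypothetical encoder
def decode(z): return (W.T @ z).reshape(2048, 3)       # hypothetical decoder

# 'With armrests' and 'without armrests' chair sets (toy point clouds).
with_arm = [rng.normal(size=(2048, 3)) + 0.1 for _ in range(20)]
without = [rng.normal(size=(2048, 3)) - 0.1 for _ in range(20)]

# Attribute direction in latent space = difference of the class means.
direction = (np.mean([encode(c) for c in with_arm], axis=0)
             - np.mean([encode(c) for c in without], axis=0))

query = rng.normal(size=(2048, 3)) - 0.1               # a plain chair
edited = decode(encode(query) + direction)             # 'add armrests'
print(edited.shape, edited.mean() > query.mean())      # shifted toward the class
```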