Emerging 3D Scanning Technologies for PropTech
Falling costs with rising quality via hardware innovations and deep learning
360° into smartphones: convergence with AI players, of course
https://www.embedded-vision.com/news/movidius-low-power-vpu-tech...
360° Video SfM: an obvious extension to combine both
Instead of manually rotating your camera, image all angles simultaneously while go...
360° Video SfM: Korea Advanced Institute of Science and Technology (KAIST)
Spherical panoramic cameras (Ricoh Theta S, Samsung Ge...
Range Sensing
Structured Light and Time-of-Flight
Microsoft Kinect: democratizing structured-light scanning
https://arxiv.org/abs/1505.05459
Structured light: a sequence of know...
KinectFusion: scanning with the Kinect
https://doi.org/10.1145/2047196.2047270 Cited by 1356 articles, see Related articles
htt...
Kinect tweaks: depth resolution improvements with polarization measurement?
http://news.mit.edu/2015/object-recognition-robots-...
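The range sensors above (structured light or time-of-flight) deliver a depth image; a common first processing step is back-projecting it into a point cloud using the camera intrinsics. A minimal NumPy sketch, where the Kinect-like intrinsics and the depth file are assumed placeholders.

import numpy as np

fx, fy, cx, cy = 525.0, 525.0, 319.5, 239.5        # assumed Kinect-like pinhole intrinsics
depth_mm = np.load("depth_frame.npy")              # placeholder HxW depth image in millimetres
z = depth_mm.astype(np.float32) / 1000.0           # convert to metres

v, u = np.mgrid[0:z.shape[0], 0:z.shape[1]]        # pixel grid
x = (u - cx) * z / fx                              # back-project each pixel to 3D
y = (v - cy) * z / fy
points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
points = points[points[:, 2] > 0]                  # drop invalid (zero-depth) readings
print(points.shape)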
Range Sensing: plenty of options
http://3dscanexpert.com/photogrammetry-benchmarks-remake-vs-photoscan-vs-realitycapture-vs-z...
Matterport: dominating real estate scanning
This $4,500 camera turns the real world into the virtual one. Today, Matterport’s...
Matterport research on semantic indoor segmentation
We collected the data using the Matterport Camera, which combines 3 struc...
Matterport technology patents
Capturing and aligning multiple 3-dimensional scenes www.google.com/patents/US8879828 Grant - Fi...
Google Tango: technology
http://www.deccanchronicle.com/technology/gadgets/210717/is-google-tango-relevant-in-2017.html
http...
Google Tango: example applications #1
We broke the news yesterday that Google was producing a prototype 3D sensing smartphone c...
Google Tango: example applications #2
Google Tango SDK examples: how to make a floor plan in 50 seconds, Alexander Grau
Google T...
“Google Tango” without depth sensors
I have always believed that bringing 3D to consumers could only work without the need fo...
Apple Depth Sensing
The iPhone X’s notch is basically a Kinect, by Paul Miller @futurepaul, Sep 17, 2017, 10:00am EDT
https://ww...
Laser scanning
LiDAR technology
Laser Scanning: LiDAR (Light Detection And Ranging)
http://dx.doi.org/10.1038/nphoton.2010.148
http://dx.doi.org/10.1080/194798...
Velodyne: the most in the news due to autonomous driving
http://velodynelidar.com/
https://www.youtube.com/watch?v=8nTFjVm9sTQ https:/...
Riegl: a range of different laser scanners
http://www.riegl.com/products/unmanned-scanning/
RIEGL VZ-400 Indoor Scanned Data by...
Riegl system in practice
https://doi.org/10.1109/IROS.2016.7759501
Namely, we propose a method for the automatic selection o...
Handheld scanning: GeoSLAM ZEB-REVO
Handheld Laser Scanning - ZEB-REVO
The ZEB-REVO is the latest, lightweight revolving lase...
GeoSLAM vs. Leica: portable scanning quality
http://dx.doi.org/10.1117/12.2270761
The paper investigates the performances of tw...
Research scanners: sensor fusion
The Indoor Multi-sensor Acquisition System (IMAS) presented in this paper consists of a whee...
Applied point cloud scans: accessibility
Point Clouds to Indoor/Outdoor Accessibility Diagnosis
J. Balado, L. Díaz-Vilariño, ...
Post-processing
Raw point clouds are massive and possibly contain a lot of redundant data points
Data quality: a compromise between file size, computational time and quality
3D model reconstruction from point cloud processed eithe...
Point Cloud Library (PCL): the most popular open-source library
http://unanancyowen.com/en/pcl-with-velodyne/
https://www.youtube...
Other libraries: CGAL and research code
Drift correction for proper image registration
https://doi.org/10.1109/ROBOT.2010.5509312
Correcting for drift (distortion) be...
Data reduction and simplification for storage
Imran Ashraf; Soojung Hur; Yongwan Park
https://doi.org/10.1109/ACCESS.2017.2...
Data reduction: compressing point clouds
Dynamic polygon cloud compression
Eduardo Pavez; Philip A. Chou (2017)
https://doi....
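The data-reduction entries above are about shrinking massive, redundant point clouds; the simplest classical tool is voxel-grid downsampling (keep one representative point per occupied voxel), sketched here in plain NumPy on a synthetic cloud as a stand-in for a raw scan.

import numpy as np

def voxel_downsample(points, voxel_size):
    # Assign every point to an integer voxel index, then average the points in each voxel.
    idx = np.floor(points / voxel_size).astype(np.int64)
    _, inverse, counts = np.unique(idx, axis=0, return_inverse=True, return_counts=True)
    sums = np.zeros((counts.size, 3))
    np.add.at(sums, inverse, points)               # accumulate points per voxel
    return sums / counts[:, None]                  # voxel centroids

cloud = np.random.rand(1_000_000, 3) * 10.0        # placeholder for a raw scan (metres)
reduced = voxel_downsample(cloud, voxel_size=0.05) # one point per 5 cm voxel
print(cloud.shape, "->", reduced.shape)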
Data-driven processing
Like in all the fields of computer vision, real-time scanning, post-processing and semantic understanding are imp...
Deep Learning beyond non-Euclidean problems
Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheyn...
Deep Learning: point clouds
https://arxiv.org/abs/1704.03847
https://arxiv.org/abs/1705.03428
Deep Learning: PointNet++
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
Charles R. Qi, Li Yi...
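PointNet, which PointNet++ extends hierarchically, digests an unordered point set by applying the same small MLP to every point and then aggregating with a symmetric max-pool. A NumPy forward-pass sketch of that core idea with random weights, not a trained model.

import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(1024, 3))                # one unordered point cloud, N x 3

def shared_mlp(x, dims):
    # The same weights are applied to every point, so the point order does not matter.
    for d_out in dims:
        W = rng.normal(scale=0.1, size=(x.shape[1], d_out))
        b = np.zeros(d_out)
        x = np.maximum(x @ W + b, 0.0)             # per-point linear layer + ReLU
    return x

features = shared_mlp(points, dims=(64, 128, 1024))  # per-point features, N x 1024
global_feature = features.max(axis=0)                # symmetric max-pool over the set
print(global_feature.shape)                          # permutation-invariant shape descriptor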
Deep Learning: 2D feature descriptors
Instead of using the old-school SIFT, SURF, ORB, etc., the feature descriptor / matching ...
Deep Learning: 3D feature descriptors
https://arxiv.org/abs/1706.04496
We present a view-based convolutional network that produ...
Mesh: generative shapes with GANs
https://arxiv.org/abs/1705.02090
Our key insight is that 3D shapes are effectively characteri...
Point cloud: generative GANs for point clouds #1a
https://arxiv.org/abs/1707.02392
We build an end-to-end pipeline for 3D point ...
Point cloud: generative GANs for point clouds #1b
Interpolating between different point clouds, using our latent space represent...
Hardware point cloud super-resolution: multiple scans
https://doi.org/10.2312/SPBG/SPBG06/009-015
Cited by 47 articles
On the ...
Deep Learning super-resolution
Plenty of options for image/video/volume super-resolution
https://arxiv.org/abs/1706.03142
https:/...
Point-cloud super-resolution
Upsampling ‘on-the-fly’ to avoid “data explosion”?
Jason Schreier
4/17/17 12:05pm Horizon Zero Dawn...
3D content generation: volumetric capture
Generate content by scanning real-life scenes and objects
Kul Wadhwa's and Roddy O'Hara's U...
3D content generation: automatic photorealism #1
Still can be quite labor-intensive to create realistic content
Get to know Rense de Bo...
3D content generation: automatic photorealism #2
Converting LiDAR scans to visually high-quality 3D content
Atom View is a new piece of...
3D content generation: style transfer for maps
Neural Networks and The Future of 3D Procedural Content Generation
by Sam Snider...
3D content generation: from video to 3D
Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks In...
3D content generation: from video (& audio) to video
Face2Face: Real-time Face Capture and Reenactment of RGB Videos
Justus Thie...

A technical introduction to scanning technologies, from Structure-from-Motion (SfM) and range sensing (e.g. Kinect and Matterport) to laser scanning (e.g. LiDAR), and to the associated traditional and deep learning-based processing techniques.

Note: due to the small font size and SlideShare's poor rendering, it is better to download the slides locally to your device.

Alternative download link for the PDF:
https://www.dropbox.com/s/eclyy45k3gz66ve/proptech_emergingScanningTech.pdf?dl=0


  1. 1. Emerging 3D Scanning Technologies for PropTech Falling costs with rising quality via hardware innovations and deep learning
  2. 2. Outline of the presentation: Structure from Motion (SfM), low-cost passive sensing; 360° imaging, omnidirectional immersive images and videos; Range sensing, structured light, Matterport and Kinect for example; Laser scanning, LiDARs from Velodyne for example; Data-driven processing, deep learning; 3D datasets, with what to train your deep learning pipelines; Future prospects, a short overview of future applications. The presentation is meant as a technical introduction to typical hardware and software processing techniques used in real estate and construction site scanning. Computer scientists new to proptech organizations and the real estate field in general might especially find this presentation useful. One assumes that the reader is familiar with the basics of deep learning.
  3. 3. Data structures for real estate scans: RGB+D, a pixel grid presenting color and depth (example from Prof. Li); Mesh (polygon), from voxel data (“3D pixels”), voxel grid meshing using marching cubes (StackExchange); Point cloud, typically unordered data (i.e. not on a grid but sparse
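A minimal sketch (not from the slides) of the three containers listed above, using NumPy plus scikit-image for the marching-cubes step; the array sizes and the toy occupancy volume are illustrative assumptions only.

import numpy as np
from skimage import measure  # for marching cubes

# 1) RGB+D: a dense pixel grid holding color and depth per pixel.
H, W = 480, 640
rgb = np.zeros((H, W, 3), dtype=np.uint8)          # color channels
depth = np.full((H, W), 2.0, dtype=np.float32)     # depth in metres

# 2) Point cloud: an unordered, sparse set of XYZ (+ optional color) samples, not on a grid.
pts = np.random.rand(10_000, 3).astype(np.float32)      # (N, 3) positions
colors = np.random.rand(10_000, 3).astype(np.float32)   # (N, 3) per-point RGB

# 3) Mesh (polygons) extracted from a voxel grid ("3D pixels") with marching cubes.
voxels = np.zeros((32, 32, 32), dtype=np.float32)
voxels[8:24, 8:24, 8:24] = 1.0                     # toy occupancy volume: a solid cube
verts, faces, normals, values = measure.marching_cubes(voxels, level=0.5)
print(rgb.shape, depth.shape, pts.shape, verts.shape, faces.shape)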
  4. 4. PropTechResources for domaininsights https://www.inman.com/ Inman Hacker Connect is created by and for the real estate technology community. Debate, discuss and define the future of real estate’s most pressing tech issues at Hacker Connect. Join more than 400 engineers, developers, designers, product managers, database architects, webmasters, and technology executives from across the real estate space. Build partnerships, connect with peers, tackle thorny tech issues, learn best practices discover innovative breakthroughs and collaborate during special hands-on keyboard sessions at this day-long, tech- first event. WHY YOU SHOULD ATTEND Hear from industry leaders on APIs, bots, data security, ownership, user experience, blockchain and more. Take part in collaborative hands-on-keyboard sessions and come out with a new tool to apply to your job. Learn how to better integrate data, workflows and be competitive in your recruitment efforts https://www.inman.com/event/hacker-17-sf/ http://www.moderneventures.com/accelerator/ https://gust.com/accelerators/moderne-accelerator (Pi Labs) is Europe’s first venture capital platform investing exclusively in early stage ventures in the property tech vertical. London, United Kingdom. http://pilabs.co.uk/ http://www.jamesdearsley.co.uk/ “The only PropTech site for the latest Property Technology news and views” #PropTech community across Europe. Join us for our next event in #Berlin http://futureproptech.de/
  5. 5. Structure from Motion (SfM): low-cost passive sensing
  6. 6. StructurefromMotionBasics Structure-from-Motion (SfM). Instead of a single stereo pair, the SfM technique requires multiple, overlapping photographs as input to feature extraction and 3-D reconstruction algorithms. - Westoby et al praehistorische-archaeologie.de - Florian Tubbesing Structure from Motion can achieve good accuracy compared to laser scanners. James and Robson (2012) Cited by 281 Articles, and see Related articles This volcanic bomb (~10 cm across) from Soufrière Hills volcano was scanned by an Arius3d laser scanner ( Stuart Robson, University College London) and also reconstructed using the SfM-MVS technique, with the results scaled by sfm_georef. Differences between cross sections through the two models have RMS values of ~0.3 mm. Point cloud: low res (6 Mb) http://www.lancaster.ac.uk/staff/jamesm/software/sfm_georef.htm SfM method basically computes the relative camera positions between all related photos. After every relative camera position is found, the scheme uses these matrices to reconstruct all feature points using triangulation. Thus there are two main problems: 1) Image registration (e.g. SIFT, SURF, ORB, etc) 2) Pose Estimation (e.g. Perspective-n-Point with RANSAC) By Dr Calle Olsson https://www.youtube.com/watch?v=i7ierVkXYa8
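As a hedged illustration of the two sub-problems named above (image registration and pose estimation, followed by triangulation), here is a minimal two-view sketch with OpenCV; the image file names and the intrinsics matrix K are placeholders, and a real SfM pipeline such as VisualSFM adds incremental registration and bundle adjustment on top of this.

import cv2
import numpy as np

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])                       # assumed pinhole intrinsics

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)  # two overlapping photos (placeholders)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

# 1) Image registration: detect and match local features (ORB here; SIFT/SURF also work).
orb = cv2.ORB_create(4000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
p1 = np.float32([kp1[m.queryIdx].pt for m in matches])
p2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 2) Pose estimation: essential matrix with RANSAC, then the relative camera pose (R, t).
E, inliers = cv2.findEssentialMat(p1, p2, K, cv2.RANSAC, 0.999, 1.0)
_, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=inliers)

# 3) Structure: triangulate the matched feature points into a sparse 3D point cloud.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
X_h = cv2.triangulatePoints(P1, P2, p1.T, p2.T)       # 4xN homogeneous points
X = (X_h[:3] / X_h[3]).T                              # Nx3 points, up to scale
print(R, t.ravel(), X.shape)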
  7. 7. StructurefromMotionLiteratureReferences https://doi.org/10.1016/j.geomorph.2012.08.021 Cited by 631 articles, and see Related articles https://arxiv.org/abs/1701.08493 Structure-from-Motion’ (SfM) operates under the same basic tenets as stereoscopic photogrammetry, namely that 3-D structure can be resolved from a series of overlapping, offset images. However, it differs fundamentally from conventional photogrammetry, in that the geometry of the scene, camera positions and orientation is solved automatically without the need to specify a priori, a network of targets which have known 3-D positions. Instead, these are solved simultaneously using a highly redundant, iterative bundle adjustment procedure, based on a database of features automatically extracted from a set of multiple overlapping images (Snavely et al 2008). Finally, even though there exist various theoretical works in the literature that study fundamental problems in SfM and/or provide rigorous analysis of stability and robustness of specific methods, we believe that the SfM community would still highly benefit from rigorous results on fundamental problems (e.g., what is the theoretically maximal amount of mismatched features or level of noise in the images that can be tolerated for a stable structure recovery, and can this be achieved efficiently?) and theoretical analysis of stability, robustness and computational efficiency of existing or new methods
  8. 8. SLAM Simultaneouslocalizationandmapping SLAM, Visual Odometry, Structure from Motion, Multiple View Stereo Yu Huang, Senior Architect, Autonomous Driving@Baidu USA https://www.slideshare.net/yuhuang/visual-slam-structure-from-motion-multiple-view-stereo Samsung R&D Institute Necessary Skills / Attributes: ● 5+ years’ experience delivering computer vision based products using C++ or Python (Masters or PhD study will be considered). ● Theoretical and practical understanding of multi-view geometry and 3D reconstruction. ● Experience with machine learning techniques within a computer vision context. ● PhD/MS in Computer Vision, Artificial Intelligence or Machine Learning. ● Expertise with Deep Neural Networks using TensorFlow or Keras. SLAM stands for Simultaneous Localization and Mapping and one way to understand it is to imagine yourself entering an unfamiliar building for the first time. As you move about the building, you don't completely forget where you have already been. Indeed, at any moment you have a pretty good idea where you are within the current map that you have so far constructed in your head, and unless you have a really bad sense of direction, you could probably turn around and get back out of the building without too much trouble. Finding your way around the building is a good example of simultaneously constructing a map and localizing yourself within that map. http://www.pirobot.org/blog/0015/
  9. 9. SLAM Traditionalalgorithm comparison http://dx.doi.org/10.1186/s41074-017-0027-2 The framework is mainly composed of three modules as follows. 1) Initialization 2) Tracking 3) Mapping Additional modules for stable and accurate vSLAM + Relocalization +Global map optimization “ From the technical point of views, there is no definitive difference between SLAM and real-time SfM.” Even though visual SLAM algorithms have been developed since 2003, vSLAM is still an active research field. Each algorithm has different characteristics. We need to choose an appropriate algorithm by considering a purpose of an application.
  10. 10. VisualOdometry Taketomi et al. (2017): http://dx.doi.org/10.1186/s41074-017-0027-2 “Odometry is to estimate the sequential changes of sensor positions over time using sensors such as wheel encoder to acquire relative sensor movement. Camera-based odometry called visual odometry (VO) is also one of the active research fields in the literature [16, 17]. From the technical point of views, vSLAM and VO are highly relevant techniques because both techniques basically estimate sensor positions. According to the survey papers in robotics [18, 19], the relationship between vSLAM and VO can be represented as follows. vSLAM = VO + global map optimization The relationship between vSLAM and VO can also be found from the papers [20, 21] and the papers [22, 23]. In the paper [20, 22], a technique on VO was first proposed. Then, a technique on vSLAM was proposed by adding the global optimization in VO [21, 23].” Towards stable visual odometry & SLAM solutions for autonomous vehicles https://www.youtube.com/watch?v=T5Y6OPG-d08 NavStik Hackerspace | Projects at Hackerspace Visual Odometry using Optic Flow
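To make the "vSLAM = VO + global map optimization" relation above concrete, a minimal monocular visual-odometry sketch follows: it chains relative poses between consecutive frames with OpenCV and keeps no global map. The video path and intrinsics are placeholders, and monocular scale remains unknown; a vSLAM system would add mapping, relocalization and global optimization on top of this loop.

import cv2
import numpy as np

K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])                    # assumed intrinsics
orb = cv2.ORB_create(2000)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

cap = cv2.VideoCapture("walkthrough.mp4")          # placeholder video file
ok, frame = cap.read()
prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
pose = np.eye(4)                                   # accumulated camera pose
trajectory = [pose[:3, 3].copy()]

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    kp1, des1 = orb.detectAndCompute(prev, None)
    kp2, des2 = orb.detectAndCompute(gray, None)
    prev = gray
    if des1 is None or des2 is None:
        continue
    matches = bf.match(des1, des2)
    if len(matches) < 8:
        continue
    p1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    p2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, mask = cv2.findEssentialMat(p1, p2, K, cv2.RANSAC, 0.999, 1.0)
    if E is None or E.shape != (3, 3):
        continue
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=mask)
    step = np.eye(4)                               # relative motion between the two frames
    step[:3, :3], step[:3, 3] = R, t.ravel()       # translation only known up to scale
    pose = pose @ np.linalg.inv(step)              # chaining relative motions = visual odometry
    trajectory.append(pose[:3, 3].copy())

print(np.array(trajectory).shape)                  # vSLAM would additionally optimize this globally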
  11. 11. SoftwareOpen-sourceVisualSFM VisualSFM:AVisualStructurefromMotion System Changchang Wu Cited by 326 articles, and see Related articles VisualSFM is a GUI application for 3D reconstruction using structure from motion (SFM). The reconstruction system integrates several of my previous projects: SIFT on GPU(SiftGPU), Multicore Bundle Adjustment, and Towards Linear-time Incremental Structure from Motion . VisualSFM runs fast by exploiting multicore parallelism for feature detection, feature matching, and bundle adjustment. Using VisualSFM and Meshlab as an offline alternative to Autodesk's excellent 123D catch. I walk you through my workflow for converting multiple images into a 3D model suitable for use in Blender. Tutorial for amateur photographers by Jamie Fuller. https://www.youtube.com/watch?v=V4iBb_j6k_g OpenSourcePhotogrammetrywithVisualSFM: Ditching123DCatchJuly12,2013 by Jesse Indoor Navigation from Multiple Images By Jaan Tollander de Balsch, 2016, Aalto https://jaantollander.github.io/SCI-C1000/pr ototype.html What is the best method for 3D object modelling and reconstruction from photos or videos taken by flying robots or drones? What is the accuracy of such reconstruction methods with regards to the vibrations of the flying drones, quality of camera and resolution? Is it possible to improve the results by organizing multiple flights and overlaying/accumulating the data in the point cloud? Is there any free software available?
  12. 12. SoftwarePythonPhotogrammetryToolbox(PPT)GUI Real photo x SfM with texture color x SfM with simple shader. Made with Python Photogrammetry Toolbox GUI and rendered in Blender with Cycles. http://184.106.205.13/arcteam/ppt.php https://github.com/archeos/ppt-gui/ Converting pictures into a 3D mesh with PPT, MeshLab and Blender http://arc-team-open-research.blogspot.co.uk/2012/09/converting-pi ctures-into-3d-mesh-with.html Blender camera tracking + Python Photogrammetry Toolbox http://arc-team-open-research.blogspot.co.uk/2012/11/blender-camer a-tracking-python.html The video show the skull reconstructed in 3D with Python Photogrammetry Toolkit GUI. Smilodon, the 3D reconstruction of the saber-toothed cat http://arc-team-open-research.blogspot.co.uk/2013/03/
  13. 13. Open-sourcelibraries forSfM OpenSfM is a Structure from Motion library written in Python on top of OpenCV. The library serves as a processing pipeline for reconstructing camera poses and 3D scenes from multiple images. https://github.com/mapillary/OpenSfM 656 stars OpenSfM OpenMVG (Multiple View Geometry) "open Multiple View Geometry" is a library for computer-vision scientists and especially targeted to the Multiple View Geometry community. https://github.com/openMVG/openMVG 1,1856 stars OpenMVG https://doi.org/10.1007/978-3-319-56414-2_5 http://imagine.enpc.fr/~marletr/publi/RRPR-2016 -Moulon-et-al.pdf Sung and Lin (2017): “VisualSFM uses the pre- emptive feature matching, the incremental structure from motion and the re-triangulation techniques. The incremental feature matching can greatly speed up the process because this kind of matching will first sort all feature points and match only first h feature points for each photo.” Sung and Lin (2017): “OpenMVG also contains incremental structure from motion technique. Besides that, they proposed a new iterative sampling method called a contrario Random Sample Consensus (AC-RANSAC) as a substitution to the original RANSAC in order to acquire higher precision and better performance. The AC-RANSAC using the “a contrario” methodology in order to find a model that best fits the data with a threshold T that adapts automatically to the noise. Hence, it is able to find a model and its associated noise without a fixed threshold.”
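The quoted comparison contrasts plain RANSAC, which needs a fixed inlier threshold, with AC-RANSAC, which adapts the threshold to the noise in the data. A small sketch of the fixed-threshold variant using OpenCV's fundamental-matrix estimator; the correspondences below are synthetic stand-ins for matched keypoints from a real image pair.

import cv2
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(200, 3))   # random 3D points
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R, _ = cv2.Rodrigues(np.array([0.0, 0.2, 0.0]))    # second view: small rotation ...
t = np.array([[0.5], [0.0], [0.0]])                # ... plus a sideways translation

p1 = (K @ X.T).T
p1 = (p1[:, :2] / p1[:, 2:]).astype(np.float32)    # projections in view 1
X2 = (R @ X.T + t).T
p2 = (K @ X2.T).T
p2 = (p2[:, :2] / p2[:, 2:]).astype(np.float32)    # projections in view 2
p2 += rng.normal(0.0, 0.5, p2.shape).astype(np.float32)   # pixel noise

# Plain RANSAC: inliers are points closer than a FIXED 1 px threshold to their epipolar line;
# AC-RANSAC (used in OpenMVG) instead adapts this threshold automatically to the data.
F, mask = cv2.findFundamentalMat(p1, p2, cv2.FM_RANSAC, 1.0, 0.999)
print(int(mask.sum()), "inliers out of", len(p1))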
  14. 14. Open-sourcelibraries forSfM+SLAM OpenChisel https://github.com/personalrobotics/OpenChisel An open-source version of the Chisel chunked TSDF library. It contains two packages: open_chisel open_chisel is an implementation of a generic truncated signed distance field (TSDF) 3D mapping library; based on the Chisel mapping framework developed originally for Google's Project Tango. It is a complete re-write of the original mapping system (which is proprietary). open_chisel is chunked and spatially hashed inspired by this work from Neissner et. al, making it more memory-efficient than fixed-grid mapping approaches, and more performant than octree-based approaches. A technical description of how it works can be found in our RSS 2015 paper. http://ri.cmu.edu/pub_files/2015/7/ChiselPaper.pdf
  15. 15. Research-gradeSfM old-school monovideo http://dx.doi.org/10.1186/s13640-017-0168-3 Inspired by the structure from motion systems, we propose a system that reconstructs sparse feature points to a 3D point cloud using a mono video sequence so as to achieve higher computation efficiency. The system keeps tracking all detected feature points and calculates both the amount of these feature points and their moving distances. We only use the key frames to estimate the current position of the camera in order to reduce the computation load and the noise interference on the system. Furthermore, for the sake of avoiding duplicate 3D points, the system reconstructs the 2D point only when the point shifts out of the boundary of a camera. In our experiments, we show that our system is able to be implemented on tablets and can achieve state-of-the-art accuracy with a denser point cloud with high speed.
  16. 16. Research-grade SfM, deep learning-based #1
  17. 17. Research-gradeSfM DeepLearning -based#2 https://arxiv.org/abs/1702.01381, 2 May 2017 We evaluated the performance of our proposal on the DTU dataset comparing it with two traditional feature based methods, namely SURF (Cited by 8683 articles) and ORB ( Cited by 2739 articles). The system is trained in an end-to-end manner utilising transfer learning from a large scale classification dataset. In addition, a variant of the proposed architecture containing a spatial pyramid pooling (SPP) layer is evaluated and shown to further improve the performance. RegNet is able to correct even large decalibrations such as depicted in the top image. The inputs for the deep neural network are an RGB image and a projected depth map. RegNet is able to establish correspondences between the two modalities which enables it to estimate a 6 DOF extrinsic calibration. Additionally, with an iterative execution of multiple CNNs, that are trained on different magnitudes of decalibration, our approach compares favorably to state-of-the-art methods in terms of a mean calibration error of 0.28º for the rotational and 6 cm for thetranslation components even for large decalibrations up to 1.5 m and 20º . https://arxiv.org/abs/1702.02295
  18. 18. Research-gradePose/Structure DeepLearning -based#1 Essentially the same technology for stereo matching and depth map generation as for SfM https://arxiv.org/abs/1703.04309 https://arxiv.org/abs/1704.07813 Empirical evaluation on the KITTI dataset demonstrates the effectiveness of our approach: 1) monocular depth performs comparably with supervised methods that use either ground-truth pose or depth for training, and 2) pose estimation performs favorably compared to established SLAM systems under comparable input settings.
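The unsupervised methods above (e.g. arXiv:1704.07813) train depth and pose networks through view synthesis: target-frame pixels are reprojected into a source frame using the predicted depth and relative pose, and the photometric difference acts as the loss. Below is a NumPy sketch of only that rigid-reprojection step, with synthetic depth, intrinsics and pose standing in for network predictions.

import numpy as np

H, W = 128, 416
K = np.array([[200.0, 0.0, W / 2], [0.0, 200.0, H / 2], [0.0, 0.0, 1.0]])
depth = np.full((H, W), 5.0)                       # "predicted" depth for the target frame
T = np.eye(4); T[0, 3] = 0.1                       # "predicted" target-to-source camera motion

# Back-project every target pixel to 3D, move it into the source camera, reproject to 2D.
v, u = np.mgrid[0:H, 0:W]
pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1)   # homogeneous pixel grid
cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)              # 3D points in the target frame
cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
src = K @ (T @ cam_h)[:3]                                        # project into the source view
src_uv = (src[:2] / src[2]).reshape(2, H, W)                     # sampling coordinates

# A training pipeline would now bilinearly sample the source image at src_uv and
# penalize the photometric difference against the target image.
print(src_uv.shape, src_uv[:, H // 2, W // 2])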
  19. 19. Research-gradePose/Structure DeepLearning -based#2 GANs on everything, so here as well :) The usefulness of VisualSFM/ openSFM/ openMVG for defensible startup products? Inversion is often ambiguous, e.g., many compositions of 3D shape and camera pose give rise to the same 2D projection. To address this ambiguity, we impose priors on the predicted latent factors, through an adversarial discriminator network trained to discriminate between predicted factors and ground-truth ones. Training adversarial inversion does not require input-output paired annotations, but merely a collection of ground-truth factors, unrelated (unpaired) to the current input. Our model can thus be self-supervised by unlabelled image data, by minimizing a joint reconstruction and adversarial loss, complementing any direct supervision provided by paired annotations. Applying adversarial inversion to super-resolution and inpainting results in automated “visual plastic surgery” Structure-from-motion(SfM) results with and without adversarial priors. The results of the baseline (columns 5th and 8th) are obtained from a model with depth smooothness prior, trained with early stopping at 40K iterations (before divergence).
  20. 20. SfMonMobileDevices https://arxiv.org/abs/1611.09498 https://doi.org/10.1109/ICCV.2013.15 | Cited by 141 articles, see Related articles https://doi.org/10.1016/j.cviu.2016.09.007 After introducing the reconstruction algorithms at the base of our approach, we show how to build applications able to generate 3D floor plans scaled to their real-world metric dimensions and capable to manage scene not necessary limited by Manhattan World assumptions. Then, exploiting the resulting structural and visual model, we propose a client-server interactive exploration system implementing a low-DOF navigation interface, specifically developed for touch interaction on smartphones and tablets. https://doi.org/10.1145/2999508.2999526
  21. 21. SfMonMobileDevices CaseDacuda Magic Leap, the augmented reality startup that has raised $1.4 billion in funding but has yet to release a product, has made an acquisition to expand its work in computer vision and deep learning, and to build out its operations into Europe. The company has acquired the 3D division of Dacuda, a computer vision startup based out of Zurich. One of Dacuda’s focuses had been developing algorithms for consumer- grade cameras (and not just cameras, but any device with a camera function) to capture 2D and 3D imaging in real time, “making 3D content as easy as taking a video.” https://techcrunch.com/2017/02/18/confir med-magic-leap-acquires-3d-division-of-d As you can see, no detail about what the two might be working on. The acquisition was first rumored last week — after Dacuda posted a note on its blog about selling its 3D division, and then some Dacuda employees updated their LinkedIn profiles as Magic Leap employees (one example here). Tom’s Hardware then speculated it could signal Magic Leap using technology developed by Dacuda to enable room-scale, six degrees of freedom tracking (essentially to improve its image capturing sensors in 3D environments). The ecosystem there is attracting other big-name M&A. Faceshift, a motion capture startup acquired by Apple in 2015, was also founded in Zurich. Facebook’s Oculus VR in August 2016 also quietly acquired a startup called Zurich Eye, incubated at the University of Zurich and ETH, the federal institute of technology. Zurich Eye became the basis of Oculus and Facebook’s office in the city. Zurich Eye, ironically, was co-founded by a three former software engineers from Dacuda (they all now work for Oculus VR). For example, in October the company had linked up with MindMaze, another virtual/augmented reality startup out of Switzerland, to build a platform they were calling “MMI, the world’s first multisensory computing platform for mobile-based, immersive and social virtual reality applications,” MindMaze noted. MindMaze said it planned to “deploy the technology for users globally to address a void left by Google’s DayDream View for positional tracking and multiplayer interactions.” We have contacted Magic Leap for comment and will update this post if and when we learn more.
  22. 22. AppleARKit Technology https://developer.apple.com/arkit/ Since the iPhone 6, iPhones have used what Apple calls “Focus Pixels”, which is its term for phase detection AF. Fast Company reports that system will be replaced with laser autofocus possibly as soon as the next iPhone, which is set to debut this fall. It is likely that Apple would use both AF technologies, as Google does in its Pixel line of phones. The technology would serve a dual purpose, also allowing for better depth perception with the inbuilt camera for augmented reality apps. ARKit rolls out with iOS 11 this fall, so it would make sense to also include the VSCEL laser system in the phone launching at the same time. https://petapixel.com/2017/07/20/apple-bring-3d-laser-autofocus-iphone-cameras-report-says/ https://www.theverge.com/2017/6/26/15872332/apple-arkit-ios-11-augmented-reality-developer-excitement
  23. 23. AppleARKit ExampleApplications https://twitter.com/madewithARKit Measuring kitchen dimensions http://bit.ly/2tJ5KV8 app by→ @SmartPicture3D Measure distances with your iPhone. Clever little #ARKit app by @BalestraPatrick http://bit.ly/2sFl8RB Inter-dimensional iPhone AR portals are closer than they appear http://bit.ly/2sufO0d ARkit demo by @nedd Demo Shows How Augmented Reality Will Make Advertising More Immersive. Mixed reality producer Bilawal Singh Sidhu show peek of what the world of advertising could be with the ARKit. #adtech https://mobile-ar.reality.news/news/apple-ar-demo-shows- augmented-reality-will-make-advertising-more-immersive-0 178905/
  24. 24. Google’s responsetoARKit ARCore DAVID JAGNEUX, UPLOADVR@UPLOADVR SEPTEMBER 2, 2017 6:00 AM “Earlier this week, Google announced ARCore, a software-based solution for making more Android devices AR-capable without the need for depth sensors and extra cameras. It will even work on the Google Pixel, Galaxy S8, and several other devices very soon and supports Java, Unity, and Unreal from day one. In short, it’s kind of like Google’s answer to Apple’s ARKit.” - https://venturebeat.com/2017/09/02/googles-first-arcore-goal-100-million-ar-capable-android-phones/ “Another example, which is especially relevant for developers that build traditional smartphone apps in Java, is that we want to make it easier than ever for people to get into 3D modeling that haven’t done it before,” Bavor says. “We know there are a lot of people that want to get into 3D development and AR but aren’t experts in Maya, or Unity, or anything. So Blocks is an app we built with the intention of enabling people that have never done a 3D model in their life to feel comfortable building 3D assets. We even made it easy to export right from Blocks and pull into ARCore apps you’re developing.”
  25. 25. ARCore tooearlytotellhowitwilldoagainst“AppleCult” Verge Adi Robertson https://youtu.be/NhJydpMkpug FusedVR https://youtu.be/dNXBvDKRg1M https://venturebeat.com/2017/08/29/google-launches-arcore -sdk-in-preview-ar-on-android-phones-no-extra-hardware-re quired/ https://youtu.be/ttdPqly4OF8 Super Ventures Blog Matt Miesnieks CEO 6D.ai, Partner @Super_Ventures, AR technology & cycling https://medium.com/super-ventures-blog/how-is-arcore-better-than-arkit-5223e6b3e79d ● Isn’t ARCore just Tango-lite? ● The iPhone-8-keynote sized elephant in the room ● So should I build on ARCore now? ● Is ARCore better than ARKit? Scottie Gardonio Aug 30 AR / VR enthusiast. Creative Manager. Passionate graphic designer. https://medium.com/iotforall/arcore-vs-arkit-google-counters-apple-33483c08d3da ARCore vs. ARKit: Google Counters Apple Let the Dueling Begin Google announcing inside-out 6-DOF tracking support for Daydream back at Google IO earlier this year.
  26. 26. DeepLearningonMobileDevices https://techcrunch.com/2017/05/17/googles-tensorflow-lite-brings-machine-learning-to-android-devices/ http://blog.stratospark.com/creating-a-deep-learning-ios-app-with-keras-and-tensorflow.html ● 3D Face Capture ● 3D Scene Reconstruction ● 2.5D Scene Reconstruction and Computational Photography ● SLAM and Object Tracking ● Augmented Reality ● Google Cardboard SDK for iOS https://doi.org/10.1109/IPSN.2016.7460664 | Cited by 28 articles, see Related articles Thursday 20 July 2017, Movidius USB stick https://techcrunch.com/2017/07/20/movidius-launches-a-79-deep-learning-usb-stick/ Snapchat secretly acquires Seene, a computer vision startup that lets ... https://techcrunch.com/.../snapchat-secretly-acquires-seene-a- computer-vision-startup-... 3 Jun 2016 https://doi.org/10.1109/PDP.2017.98 https://arxiv.org/abs/1705.06224
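The linked article covers the 2017 TensorFlow Lite announcement; as a rough sketch with today's stable tf.lite API (not the original 2017 preview), converting a Keras model for on-device inference looks like the following, with an untrained MobileNetV2 as a placeholder model.

import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights=None)   # any trained Keras model works here
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]       # optional size/latency optimization
tflite_bytes = converter.convert()

with open("model.tflite", "wb") as f:                      # this file ships inside the mobile app
    f.write(tflite_bytes)
print(len(tflite_bytes), "bytes")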
  27. 27. 360°imaging
  28. 28. 360°(omnidirectionalimaging) Introduction The Panoptic Camera platform developed jointly by Microelectronic Systems Laboratory (LSM) and Signal Processing Laboratory (LTS2) of EPFL.* http://lsm.epfl.ch/page-52820-en.html Wikipedia: “360-degree videos, also known as immersive videos[1] or spherical videos ,[2] are video recordings where a view in every direction is recorded at the same time, shot using an omnidirectional camera or a collection of cameras. During playback the viewer has control of the viewing direction like a panorama.” Consumer-level camera review http://thewirecutter.com/reviews/best-360-degree-camera/ By DANIEL CULPANWednesday 12 August 2015 http://www.wired.co.uk/article/9-mind-blowing-360-degree-videos Scuba Diving Short Film in 360° Green Island, Taiwan https://youtu.be/2OzlksZBTiA
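Since the 360° frames described above are stored as equirectangular images, a viewer (or an automatic director such as the "deep 360 pilot" later in the deck) extracts a normal-field-of-view (NFoV) perspective crop for a chosen viewing angle. A minimal OpenCV/NumPy sketch of that extraction; the input file name is a placeholder.

import cv2
import numpy as np

def nfov_view(equi, yaw_deg, pitch_deg, fov_deg=90.0, out_hw=(480, 640)):
    H, W = equi.shape[:2]
    h, w = out_hw
    f = 0.5 * w / np.tan(np.radians(fov_deg) / 2.0)           # pinhole focal length in pixels
    x, y = np.meshgrid(np.arange(w) - w / 2.0, np.arange(h) - h / 2.0)
    rays = np.stack([x, y, np.full_like(x, f)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)       # unit viewing rays

    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)    # rotate rays to the look direction
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)], [0, 1, 0], [-np.sin(yaw), 0, np.cos(yaw)]])
    Rx = np.array([[1, 0, 0], [0, np.cos(pitch), -np.sin(pitch)], [0, np.sin(pitch), np.cos(pitch)]])
    rays = rays @ (Ry @ Rx).T

    lon = np.arctan2(rays[..., 0], rays[..., 2])               # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(rays[..., 1], -1, 1))              # latitude in [-pi/2, pi/2]
    map_x = ((lon / np.pi + 1.0) * 0.5 * (W - 1)).astype(np.float32)
    map_y = ((lat / (np.pi / 2) + 1.0) * 0.5 * (H - 1)).astype(np.float32)
    return cv2.remap(equi, map_x, map_y, cv2.INTER_LINEAR, borderMode=cv2.BORDER_WRAP)

pano = cv2.imread("theta_frame.jpg")                           # placeholder equirectangular frame
view = nfov_view(pano, yaw_deg=30, pitch_deg=-10)
cv2.imwrite("nfov.jpg", view)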
  29. 29. 360°aspartof “10BreakthroughTechnologiesof2017” https://www.technologyreview.com/s/603496/10-breakthrough-technologies-2017-the-360-degree-selfie/ Seasonal changes to vegetation fascinate Koen Hufkens. So last fall Hufkens, an ecological researcher at Harvard, devised a system to continuously broadcast images from a Massachusetts forest to a website called VirtualForest.io. And because he used a camera that creates 360°pictures, visitors can do more than just watch the feed; they can use their mouse cursor (on a computer) or finger (on a smartphone or tablet) to pan around the image in a circle or scroll up to view the forest canopy and down to see the ground. Journalists from the New York Times and Reuters are using $350 Samsung Gear 360 cameras to produce spherical photos and videos that document anything from hurricane damage in Haiti to a refugee camp in Gaza. One New York Times video that depicts people in Niger fleeing the militant group Boko Haram puts you in the center of a crowd receiving food from aid groups. Or consider the spherical videos of medical procedures that the Los Angeles startup Giblib makes to teach students about surgery. The company films the operations by attaching a $500 360fly 4K camera, which is the size of a baseball, to surgical lights above the patient. The 360° view enables students to see not just the surgeon and surgical site, but also the way the operating room is organized and how the operating room staff interacts. These applications are feasible because of the smartphone boom and innovations in several technologies that combine images from multiple lenses and sensors. For instance, 360° cameras require more horsepower than regular cameras and generate more heat, but that is handled by the energy-efficient chips that power smartphones. Both the 360fly and the $499 ALLie camera use Qualcomm Snapdragon processors similar to those that run Samsung’s high- end handsets. Once people discover spherical videos, research suggests, they shift their viewing behavior quickly. The company Humaneyes, which is developing an $800 camera that can produce 3-D spherical images, says people need to watch only about 10 hours of 360° content before they instinctively start trying to interact with all videos. When you see 360°imagery that truly transports you somewhere else, you want it more and more.
  30. 30. Low-costendSamsung Gear andGalaxy Samsung Gear360, ~£250 Samsung GearVR, ~£100 Samsung Galaxy S6-8, smartphone, ~£200-£700 http://www.samsung.com/uk/wearables/gear-360-c200/ If you’re clamoring to shoot in 360 degrees, the Gear 360 balances simple design with workable image quality — but you really need a Samsung phone (and a Gear VR, and a good hunk of money) to get the most out of it. And, for now, that's fine. This version of the Gear 360 is more likely to be looked back on as a relic anyway, a recognizable but eventually dismissible attempt at a new idea, and the foundation for whatever Samsung does next.
  31. 31. Low-costend#2Ricoh Theta Ricoh’s Theta V 4K camera sports 360- degree video and wireless playback RYAN WINTERHALTER, UPLOADVR@@UPLOADVR SEPTEMBER 02, 2017 07:03 PM https://venturebeat.com/2017/09/02/ricohs-theta-v-4k-camera-sport s-360-degree-video-and-wireless-playback/ Ricoh is unveiling its latest 360-degree camera this morning. Dubbed the Ricoh Theta V, the $430 4K camera is the latest in the line which launched in 2013 with the Ricoh Theta. Available for pre-order now, and shipping in mid-September, the Theta V features 3,820-by-1,920 resolution video capture. That’s a massive improvement on the earlier Theta S, which offered a sub-1,080p 1,920-by-960, and the Theta SC, which allowed for 1,920-by-1,080 recording. Perhaps the biggest usability improvement to the Theta V is the inclusion of remote playback. Users can now wirelessly stream their video to an external display directly from the camera. Previous devices in the Theta line (except the developer-only Theta R) required users to export their raw footage into a computer to stitch the image and create a useable video. That’s now all done on the device. Videographers can watch their footage on any display, and move the POV by moving the camera itself. The Theta V boosts sound quality as well. Four microphones capture data from their respective dimensions, creating spatial audio that allows users to hear where the sound is coming from within the recording. Ricoh Theta V hands-on Published Aug 31, 2017 | Jeff Keller Based on some quick tests of a non-final Theta V, both stills and videos are noticeably better than those from its predecessor. We're looking forward to getting our hands on a production model in a few weeks and putting it through its paces. For higher quality audio capture, Ricoh is offering the TA-1 3D Microphone ($269). Developed by Audio Technica, the mic attaches via the tripod mount and uses a standard 3.5mm audio jack.
  32. 32. HigherEndGoPro, Nokia Ozo, FacebookSurround, etc. GoPro (NASDAQ:GPRO) recently unveiled the Omni, a six-camera rig for filming interactive spherical videos that can be explored through a smartphone's movements, a user's finger swipes, or a virtual reality headset. The device is the smaller sibling of the 16-camera Odyssey rig ($15,000), which hasn't been launched despite being announced nearly a year ago. Let's take a look at four key things investors should know about the Omni ($3,500), and how they might impact GoPro's future. https://www.fool.com/investing/general/2016/04/14/4-things-inves tors-need-to-know-about-gopro-incs-o.aspx What's next for GoPro? GoPro investors don't have many catalysts to look forward to this year. The Omni is too pricey relative to its peers to gain any mainstream traction. The Karma drone, which is due to arrive within the next two months, faces tough competition from market leader DJI Innovations. By the time the Hero 5 cameras arrive near the end of the year, the mainstream market could be saturated with cheap VR and flying cameras. Introducing Facebook Surround 360: An open, high-quality 3D-360 video capture system Brian K Cabral, April 12, 2016 ● Facebook has designed and built a durable, high- quality 3D-360 video capture system. ● The system includes a design for camera hardware and the accompanying stitching code, and we will make both available on GitHub this summer. We're open-sourcing the camera and the software to accelerate the growth of the 3D-360 ecosystem — developers can leverage the designs and code, and content creators can use the camera in their productions. ● The system exports 4K, 6K, and 8K video for each eye. The 8K videos double industry standard output and can be played on Gear VR with Facebook's custom Dynamic Streaming technology. https://code.facebook.com/posts/1755691291326688/introduc ing-facebook-surround-360-an-open-high-quality-3d-360-vid eo-capture-system/ https://www.theverge.com/2016/4/25/11421992/disney-nokia-oz o-camera-virtual-reality-star-wars-marvel Ever since Nokia announced its 360-degree Ozo virtual reality camera it has positioned the system as a high-end option for Hollywood filmmakers, and today the company is announcing a partnership with Disney that should help deliver on that promise. As part of the deal, Ozo cameras will be put into the hands of Disney filmmakers and its marketing teams to create 360-degree, virtual reality content across all of the studio’s various brands.
  33. 33. LytroImmerge The world'sfirst professional Light Field solution forcinematicVR roadtovr.com/lytros-immerge-360 https://www.lytro.com/immerge Consequently, to create a virtual reality that even the human eye cannot distinguish from the real world, we must achieve the perfect immersive viewing experience, such that human viewers feel they can walk into the scene. This is known as the virtual walk-in effect, and it requires light-field technology—3D imaging technology that emerged from the field of computational imaging/photography to capture the light rays that people perceive from different locations and directions. When combined with computer vision and deep learning, light- field technology provides a viable path for producing low-cost, high-quality VR content, positioning this technology to be the most profitable segment of the VR industry.
  34. 34. “DepthLytro”‘Depth sensing with light fieldtechniques Refocusing in spite of foreground occlusions: (a) Scene containing a monkey toy being partially occluded by a plant in the foreground, (b) traditional synthetic aperture refocusing on light field is partially effective in removing the effect of foreground plants, (c) synthetic aperture refocusing of depth displays corruption due to occlusion, (d) histogram of depth clearly shows two clusters corresponding to plant and monkey, (e) virtual aperture refocusing after removal of plant pixels shows sharp depth image of monkey, (f) Quantitative comparison of indicated scan line of the monkey’s head for (c) and (e) We use coding techniques from Tadano et al. (2015) to image beyond backscattering nets. Notice how the corrupted depth maps are improved using the codes. We show how digital refocusing can be performed on the images without the scattering occluders by combining depth fields with coded TOF. https://arxiv.org/abs/1509.00816
  35. 35. Post-processingfor360° imaging https://doi.org/10.1007/s00371-017-1368-7 Overall process. a Input image. b Lines detected and classified: red for vertical lines and yellow for horizontal lines. c Great circles from the classified lines. Green dots are vanishing points computed from horizontal (yellow) lines. d Upright adjustment result We implemented our method using C++ and the OpenCV library on a 64-bit Windows PC with an Intel i7- 6700K 4.00GHz CPU and 32GB RAM. For an input image of size 5376 × 2688 px, it takes a few hundred milliseconds (less than one second) to obtain the final rotation matrix R for upright adjustment. https://arxiv.org/abs/1703.10798 http://vllab1.ucmerced.edu/~wlai24/360hyperlapse Pipeline of the proposed algorithm. Given a 360 video, we first stabilize the sequence to smooth the relative rotation◦ between adjacent frames. We estimate the focus of expansion (i.e., the direction of forward motion) as a prior information for our camera path planning. To extract the regions of interest, we compute the spatial-temporal saliency and semantic segmentation. The detected regions of interest are used to guide the camera path planning. Finally, we use an adaptive 2D video stabilization to render a smooth hyperlapse.
  36. 36. 360°DeepLearning #1 http://dx.doi.org/10.3390/s17061341 https://arxiv.org/abs/1705.01759 Watching a 360º sports video requires a viewer to continuously select a viewing angle, either through a sequence of mouse clicks or head movements. To relieve the viewer from this “360 piloting” task, we propose “deep 360 pilot” – a deep learning-based agent for piloting through 360º sports videos automatically Panel (a) overlaps three panoramic frames sampled from a 360 skateboarding video◦ with two skateboarders. One skateboarder is more active than the other in this example. For each frame, the proposed “deep 360 pilot” selects a view – a viewing angle, where a Natural Field of View (NFoV) (cyan box) is centered at. It first extracts candidate objects (yellow boxes), and then selects a main object (green dash boxes) in order to determine a view (just like a human agent). Panel (b) shows the NFoV from a viewer’s perspective.
  37. 37. 360°DeepLearning #2 Flat2Sphere: Learning Spherical Convolution for Fast Features from 360° Imagery Yu-Chuan Su, Kristen Grauman (Submitted on 2 Aug 2017) https://arxiv.org/abs/1708.00919 We propose to learn a spherical convolutional network that translates a planar CNN to process 360° imagery directly in its equirectangular projection. Our approach learns to reproduce the flat filter outputs on 360° data, sensitive to the varying distortion effects across the viewing sphere. The key benefits are 1) Efficient feature extraction for 360°images and video, and 2) The ability to leverage powerful pre- trained networks researchers have carefully honed (together with massive labeled image training sets) for perspective images. We validate our approach compared to several alternative methods in terms of both raw CNN output accuracy as well as applying a state-of-the-art "flat" object detector to 360° data. Our method yields the most accurate results while saving orders of magnitude in computation versus the existing exact reprojection solution.
  38. 38. 360°Therolein PropTech? #1a Usefor real estate agents, still a novelty/gimmicky? (from 2014 until 2017) MAY 26, 2014 By James Dearsley http://www.jamesdearsley.co.uk/is-the-property-industry-intereste d-in-360-degree-hd-filming/ USES OF 360 DEGREE HD FILMING IN REAL ESTATE: 1. Sales and Marketing. Firstly, from a realtor or estate agent perspective there are several uses here of 360 degree cameras, the first being obvious, that of sales and marketing. It will be simple and efficient to take a quick film of each room, or just walk through the property with these devices to record what you need 2. Property Management issues. We have also seen interest from companies looking to use these bits of equipment for inventory taking. Seeing as they are of HD quality it means you can quickly take photographs of properties which can later be looked at in more detail should problems arise in letting disputes. 3. Virtual Reality. With Facebook recently buying Oculus Rift for $2 Billion, it is getting less far fetched. Considering the price of an Oculus is relatively cheap (reckoned to be less than $500/£360 when released next year) it would not be surprising if Facebook are hoping for a lot of people to be purchasing these (Candy Crush Saga in Virtual Reality anyone?!). It isn’t just Facebook though; Sony have a VR headset in production as does Samsung (it was recently announced) and so this space is going to move quickly. By using these cameras you can put your clients into these homes very quickly and easily – either in the office, if you get a set of these yourself, or, in time, in their own home if Facebook get their way. https://www.forbes.com/sites/forbesagencycouncil/2017/06/28/want-to-use-360 -degree-photo-and-video-11-things-to-consider/#22fffa955002 1. I would recommend that marketers stay on the sidelines until the industry matures. - Kristopher Jones, LSEO.com 4. Use A Strategic Approach The capabilities of 360-degree photo/video have powerful applications in many industries, including real estate, retail and tourism. A 360-degree view has a better chance of selling a house than a static image. - Brock Murray, seoplus+ 7. Prepare For Tomorrow's Consumer Expectations Today, 360-degree photos and videos are very helpful in industries such as the auto industry or real estate where visualizing the product is essential. As VR continues to grow, 360-degree photos and videos will likely become a standard. The consumers' expectations will likely adjust to needing to learn more about the overall "360-degree" experience of the restaurant for example, not just a picture of the dish. - Ahmad Kareh, Twistlab Marketing 11. Create An Emotional Connection 360-degree multimedia is a brilliant tool for meaningful storytelling, as it allows the consumer to be transported to the experience you want them to have, bringing the story to life. Companies should take advantage of these tools to transform products into experiences, cultivating an immersive and emotional connection with the brand. - Joey Hodges, Demonstrate PR JUN 28, 2017 by Forbes Agency Council
  39. 39. 360°Therolein PropTech? #1b Usefor real estate agents A four-wheeled tripod outfitted with a computer, 360- degree camera and sensors can roam properties, producing highly choreographed, immersive videos that would be difficult — if not impossible — to replicate with a normal video camera. VirtualAPT (Brooklyn, NYC) offers residential tour service at now $1/ft² (~10.8$/m²), and for commercial uses, for a monthly fee per building or $0.50/ft² (~5.4$/m²) for separate units. Generated by technology from companies such as Matterport, 3-D home tours allow users to jump between 360-degree photos — sometimes situated within a 3-D model. ● A rover can shoot 360-degree footage of a home while moving along a pre-plotted route. ● Made by VirtualAPT, the videos can include on-camera presentations from real estate agents. ● They're an alternative to 3-D homes tours from companies such as Matterport. https://www.youtube.com/watch?v=JhfQK-tDvGU
  40. 40. 360°Therolein PropTech? #2a Use forconstruction andasatoolforconstructing4D/5D/6DBIM (BuildingInformationModel) Construction site manager manually taking photos of the progress. - Time-consuming to walk through and take photos - No full coverage of site - Might forget some spots - Nice initial 3D BIM not properly maintained during construction site. + Ideally have a drone inspecting the whole construction site with an on- board 360 degree video and a LIDAR / laser scanner. + One can go back in time and see who of the subcontractors for example are responsible for possible problems https://doi.org/10.1186/s40327-014-0016-9
  41. 41. 360°Therolein PropTech? #2b 360 videos registered or not to 3D BIM model allows inspection of the progress (“4D BIM”) in the construction site also retrospectively, and can possibly reduce legal battles when it is clearer who is the one to be held responsible in case of discrepancies between as-built and as-planned data. VISUAL ASSET MANAGEMENT Visual Asset Management (VAM) service digitizes industrial and infrastructure assets using 360 degree images, 3D Models, and relative asset information. 3D MODELING We thrive on enabling 3D realistic visualization to projects while preserving the minute details necessary to portray our world. 360 VIDEO 360 video enables viewers to be at the center of any medium, allowing for a unique visual experience and situational awareness from any device. VIRTUAL REALITY OcuTech’s virtual reality solutions stimulate creative thinking and enhanced information sharing allowing for one of kind virtual experience. Ocutech from Houston, Texas, USA is already providing these type of services https://ocutech360.com/3d-architectural-visualization-solution/#3dvrvideo
  42. 42. 360°imaging+SfM
43. 43. 360°intosmartphones howbigwillitbe? https://www.engadget.com/2017/07/10/future-of-smartphone-camera/ 1) Augmented reality 2) Dual-lens cameras 3) Better lenses 4) 4K recording 5) Thermal imaging 6) Optical zoom 7) 360 video “Several smartphone makers, including Samsung and Huawei, have already released add-on 360-degree cameras for their handsets, but this is something that could eventually be integrated into the phones themselves. Immersive 360-degree videos are gradually making their mark, with Facebook among the big firms pushing the technology, while virtual reality companies are gradually introducing more 360-VR content that can be viewed on mobile phones.” https://techcrunch.com/2016/08/30/the-future-of-mobile-video-is-virtual-reality/ Are 360 cameras the future? https://youtu.be/i8EUerX90-0 TechAltar So, will teens in big numbers ever apply Snapchat bunny ears to immersive 360-degree videos?
  44. 44. 360°intosmartphones plentyofoptionscoming#1 Acer’s new Holo 360 degree camera is essentially a smartphone Acer has announced its entry into the VR video market with a device that’s half 360-degree camera, half smartphone. http://www.trustedreviews.com/news/acer-s-new-ho lo-360-degree-camera-is-essentially-a-smartphone -2953609 Paul Monckton CONTRIBUTOR I write about photography and related subjects https://www.forbes.com/sites/paulmonckton/2016/05/31/worlds-first-live-smartphone-vr-camera/#9 fea6921a8b0 Yesterday at this year’s Computex trade show in Taipei, Quanta Computer and ImmerVision jointly announced what is claimed to be the world’s first 360-degree live VR streaming camera for smartphones, with demos starting from today. The, as yet unnamed, camera fits in the palm of the hand and is designed to attach magnetically to any smartphone. It comes with a 360-degree by 187-degree lens and uses a Sony Exmor-HDR imaging sensor to produce 16 megapixel panoramic images. ImmerVision's Panamorph lens makes more efficient use of an image sensor (Image credit: ImmerVision) THIS ADD-ON CAMERA WILL TURN YOUR SMARTPHONE INTO A 360 CAMERAJULY 26, 2017 ION360 U 4K 360-Degree Smartphone Camera is comprised of a 360 camera that goes on top of Essential's 360 Camera Is the World's Smallest 360-Degree Personal Camera for a Smartphone 30 May 2017 http://gadgets.ndtv.com/mobiles/news/essentials-360-camera-is-the-worlds-sm allest-360-degree-personal-camera-for-a-smartphone-1705826 After months of teasing, Android creator Andy Rubin has finally unveiled the Essential Phone that features a near bezel-less display that tries to outdo Samsung's Galaxy S8. Essential's 360 camera, which weighs around 35 grams and is being called the world's smallest 360- degree personal camera by the company, includes a dual 12-megapixel fisheye sensors that can capture 4K 360 video at 30fps. The camera also features 4 microphones to capture sound in 3D. The 360 camera can be bought along with the Essential Phone for an additional $50, or can be bought separately which will cost you $199. @essential, Palo Alto, CA, essential.com
45. 45. 360°intosmartphones plentyofoptionscoming#2 ProTruly’s Darling https://www.theverge.com/2017/3/5/14809182/protruly-darling-360-degree-camera-smartphone The cameras found on ProTruly’s devices are made by a company called HT Optical. The company said that it is working on a much smaller 360 camera module that will actually fit into a 7.6 mm thick smartphone and will be capable of capturing 16 MP photos and shooting 4K videos. What’s even more interesting is that the module will only add an extra 1 mm to the overall thickness of a device. https://www.theverge.com/circuitbreaker/2017/2/22/14698026/huawei-360-degree-camera-honor-vr-smartphones http://360rumors.com/ https://www.vrfocus.com/2017/07/360-degree-video-editing-app-for-smartphones/ V360 - 360 video editor by Avincel Group Inc. 360-Degree Video Editing App For Smartphones: the V360 editing suite is already out for Android, with an iOS version coming soon.
  46. 46. 360°intosmartphones convergencewith AI players of course https://www.embedded-vision.com/news/movidius-low-po wer-vpu-technology-delivers-4k-vr-pixel-processing-p erformance-motorola%E2%80%99s-newest Movidius’ Myriad 2 Vision Processing Unit (VPU) technology, known for its image signal processing and computer vision capabilities with high energy efficiency, was selected by Motorola Mobility to power their newest Moto Mod: the 360 Camera. Moto Mods are unique modular accessories for Motorola smartphones that bring advanced functionality beyond traditional smartphone features. Motorola’s newest Moto Mod brings users the ability to live stream 360 videos⁰ while preserving battery life. Say Hello to the moto z² Force Edition with moto mods https://www.youtube.com/watch?v=0moMnChM6Ds https://www.wsj.com/articles/intel-to-buy-semiconduct or-startup-movidius-1473170441 https://www.altera.com/solutions/industry/automotive/applicat ions/drive-assistance/surround-view-camera.html http://www.nvidia.co.uk/object/drive-px-uk.html
47. 47. 360°VideoSfM Obviousextensiontocombineboth Instead of manually rotating your camera, image all angles simultaneously while going through the rooms in an apartment https://uploadvr.com/adobe-algorithm-6dof-360-cam/ http://variety.com/2017/digital/news/adobe-6dof-vr-video-algorithms-1202394491/ Adobe Motion Parallax demo https://youtu.be/37Z4f6p1HOY https://www.roadtovr.com/adobes-new-research-aims-give-depth-monoscopic-360-video/: Other techniques to achieve 6-DoF VR video usually require light-field cameras like HypeVR’s crazy 6k/60 FPS, LiDAR rig or Lytro’s giant Immerge camera. While these undoubtedly will produce a higher quality 3D effect, they’re also custom-built and ungodly expensive. 6-DOF VR videos with a single 360-camera Jingwei Huang ; Zhili Chen ; Duygu Ceylan ; Hailin Jin, Virtual Reality (VR), 2017 IEEE http://dx.doi.org/10.1109/VR.2017.7892229, 18-22 March 2017 Given a 360-video captured by a single spherical panorama camera, in an offline pre-processing stage, we recover the camera motion and the scene geometry first by performing structure-from-motion (SfM) followed by dense reconstruction. Then, in real-time we playback the video in a VR headset where we track the 6-DOF motion of the headset and synthesize new views by a novel warping algorithm.
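A common low-cost route to the 360°+SfM combination above is to resample each equirectangular frame into several virtual pinhole views and feed those to a standard SfM tool (VisualSFM, OpenSfM, COLMAP). The sketch below is a minimal illustration of that resampling step, assuming OpenCV and NumPy; the input file name, field of view and view angles are illustrative, and this is not the Adobe or Huang et al. pipeline itself.

```python
# Minimal sketch: sample pinhole views from an equirectangular 360 frame
# so a standard SfM pipeline (VisualSFM, OpenSfM, COLMAP) can consume them.
# Assumes OpenCV and NumPy; file name and parameters are illustrative.
import cv2
import numpy as np

def equirect_to_pinhole(equi, yaw_deg, pitch_deg, fov_deg=90, out_size=800):
    """Render one perspective view (yaw/pitch in degrees) from an equirect image."""
    h_e, w_e = equi.shape[:2]
    f = 0.5 * out_size / np.tan(np.radians(fov_deg) / 2.0)   # pinhole focal length

    # Pixel grid of the virtual pinhole camera, rays in camera coordinates.
    u, v = np.meshgrid(np.arange(out_size), np.arange(out_size))
    x = (u - out_size / 2.0) / f
    y = (v - out_size / 2.0) / f
    z = np.ones_like(x)
    rays = np.stack([x, y, z], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)

    # Rotate rays by yaw (around y) and pitch (around x).
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    R_yaw = np.array([[ np.cos(yaw), 0, np.sin(yaw)],
                      [ 0,           1, 0          ],
                      [-np.sin(yaw), 0, np.cos(yaw)]])
    R_pitch = np.array([[1, 0,              0             ],
                        [0, np.cos(pitch), -np.sin(pitch)],
                        [0, np.sin(pitch),  np.cos(pitch)]])
    rays = rays @ (R_yaw @ R_pitch).T

    # Ray direction -> spherical angles -> equirectangular pixel coordinates.
    lon = np.arctan2(rays[..., 0], rays[..., 2])             # [-pi, pi]
    lat = np.arcsin(np.clip(rays[..., 1], -1, 1))            # [-pi/2, pi/2]
    map_x = ((lon / np.pi + 1.0) * 0.5 * (w_e - 1)).astype(np.float32)
    map_y = ((lat / (np.pi / 2) + 1.0) * 0.5 * (h_e - 1)).astype(np.float32)
    return cv2.remap(equi, map_x, map_y, interpolation=cv2.INTER_LINEAR)

# Example: 8 views around the horizon from one Theta/Gear 360 frame (hypothetical file).
frame = cv2.imread("equirect_frame.jpg")
views = [equirect_to_pinhole(frame, yaw, 0.0) for yaw in range(0, 360, 45)]
```

The extracted views can then be written to disk and passed to any of the SfM packages listed earlier in the deck.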
48. 48. 360°VideoSfM KoreaAdvanced Institute ofScience andTechnology(KAIST) Spherical panoramic cameras (Ricoh Theta S, Samsung Gear 360 and LG 360) Our sphere sweeping algorithm enables computing all-around dense depth maps, minimizing the loss of spatial resolution. With the estimated all-around image and depth map, we have shown practical utilities by introducing 360° stereoscopic and anaglyph images as VR content. European Conference on Computer Vision ECCV 2016: Computer Vision – ECCV 2016 pp 156-172 https://doi.org/10.1007/978-3-319-46487-9_10 All-Around Depth from Small Motion with a Spherical Panoramic Camera. Sunghoon Im, Hyowon Ha, François Rameau, Hae-Gon Jeon, Gyeongmin Choe, In So Kweon
  49. 49. RangeSensing Structured-LightandTime-of-Flight
50. 50. MicrosoftKinect Democratizing structuredlightscanning https://arxiv.org/abs/1505.05459 Structured light: A sequence of known patterns is sequentially projected onto an object and gets deformed by the geometric shape of the object. The object is then observed by a camera from a different direction. By analyzing the distortion of the observed pattern, i.e. the disparity from the original projected pattern, depth information can be extracted. The Time-of-Flight (ToF) technology is based on measuring the time that light emitted by an illumination unit requires to travel to an object and back to the sensor array. The Kinect ToF camera applies this CW intensity modulation approach. Due to the distance between the camera and the object (sensor and illumination are assumed to be at the same location), and the finite speed of light c, a time shift φ [s] is caused in the optical signal, which is equivalent to a phase shift in the periodic signal. This shift is detected in each sensor pixel by a so-called mixing process. The time shift can be easily transformed into the sensor-object distance, as the light has to travel the distance twice. Cited by 65 articles - see Related articles
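As a small worked example of the phase-to-distance relation described above, the snippet below converts a measured phase shift into range for a continuous-wave ToF sensor; the modulation frequency is an illustrative value, not an actual Kinect specification.

```python
# Worked example of the CW time-of-flight relation described above:
# a measured phase shift phi of the modulated signal maps to distance
# d = c * phi / (4 * pi * f_mod), because the light covers the range twice.
# The modulation frequency below is illustrative, not a Kinect spec.
import math

C = 299_792_458.0          # speed of light [m/s]
F_MOD = 80e6               # modulation frequency [Hz] (illustrative)

def phase_to_distance(phi_rad: float) -> float:
    """Distance in metres for a phase shift in radians (no phase unwrapping)."""
    return C * phi_rad / (4.0 * math.pi * F_MOD)

def ambiguity_range() -> float:
    """Maximum unambiguous range: a full 2*pi wrap corresponds to c / (2 * f_mod)."""
    return C / (2.0 * F_MOD)

print(phase_to_distance(math.pi / 2))   # ~0.47 m at 80 MHz
print(ambiguity_range())                # ~1.87 m at 80 MHz
```

The ambiguity range is why real ToF cameras combine several modulation frequencies and unwrap the phase before converting it to depth.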
51. 51. KinectFusion Scanning with Kinect https://doi.org/10.1145/2047196.2047270 Cited by 1356 articles, see Related articles https://arxiv.org/abs/1704.01047 https://arxiv.org/abs/1612.02859 The semantic cue from the floorplan (i.e., door detection) resolves ambiguities. The figure shows the best placement based on the unary potential with or without the semantic cue. We show qualitative results on ModelNet using the TSDF encoding (Curless and Levoy, 1996) and 4 views. The same TSDF truncation threshold has been used for traditional fusion, our OctNetFusion approach and the ground truth generation process. While the baseline approach is not able to resolve conflicting TSDF information from different viewpoints, our approach learns to produce a smooth and accurate 3D model from highly noisy input. By learning the structure of real world 3D objects and scenes, our approach is further able to reconstruct occluded regions and to fill gaps in the reconstruction. We evaluate our approach extensively on both synthetic and real-world datasets for volumetric fusion. Further, we apply our approach to the problem of 3D shape completion from a single view where our approach achieves state-of-the-art results.
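For readers unfamiliar with the TSDF representation referenced above (Curless and Levoy, 1996), the following is a minimal NumPy sketch of the per-frame fusion update that KinectFusion-style systems perform; intrinsics, poses and depth maps are placeholders, and no octree or learned component (as in OctNetFusion) is included.

```python
# Minimal NumPy sketch of TSDF fusion in the spirit of Curless & Levoy (1996),
# the volumetric representation used by KinectFusion and OctNetFusion.
# Camera intrinsics, poses and the depth maps are placeholders.
import numpy as np

VOX, SIZE, TRUNC = 128, 3.0, 0.05          # grid resolution, cube side [m], truncation [m]
tsdf = np.ones((VOX, VOX, VOX), np.float32)
weight = np.zeros_like(tsdf)

# World coordinates of every voxel centre (cube centred at the origin).
idx = np.indices((VOX, VOX, VOX)).reshape(3, -1).T
voxels = (idx + 0.5) / VOX * SIZE - SIZE / 2.0

def integrate(depth, K, T_cam_world):
    """Fuse one depth map (H x W, metres) given intrinsics K and camera pose."""
    # Transform voxel centres into the camera frame and project them.
    pts = (T_cam_world[:3, :3] @ voxels.T + T_cam_world[:3, 3:4]).T
    z = pts[:, 2]
    z_safe = np.where(np.abs(z) > 1e-6, z, 1e-6)            # avoid division by zero
    uv = (K @ pts.T).T
    u = np.round(uv[:, 0] / z_safe).astype(np.int64)
    v = np.round(uv[:, 1] / z_safe).astype(np.int64)
    h, w = depth.shape
    ok = (z > 0.1) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = np.where(ok, depth[np.clip(v, 0, h - 1), np.clip(u, 0, w - 1)], 0.0)
    ok &= d > 0
    # Signed distance along the ray, truncated to [-1, 1] in units of TRUNC.
    sdf = np.clip((d - z) / TRUNC, -1.0, 1.0)
    upd = ok & (sdf > -1.0)                                  # skip voxels far behind the surface
    flat_t, flat_w = tsdf.reshape(-1), weight.reshape(-1)    # views into the volumes
    flat_t[upd] = (flat_t[upd] * flat_w[upd] + sdf[upd]) / (flat_w[upd] + 1.0)
    flat_w[upd] += 1.0

# Example with a synthetic flat wall 1.5 m in front of a camera at the origin.
K = np.array([[525.0, 0, 319.5], [0, 525.0, 239.5], [0, 0, 1.0]])
integrate(np.full((480, 640), 1.5, np.float32), K, np.eye(4))
```

A triangle mesh can then be extracted from the fused volume with marching cubes (e.g. `skimage.measure.marching_cubes`), which is the last step of the classic KinectFusion pipeline.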
  52. 52. Kinecttweaks depthresolution improvementswithpolarization measurement? http://news.mit.edu/2015/object-recognition-robots-0724 https://youtu.be/m6sStUk3UVk http://news.mit.edu/2015/algorithms-boost-3-d-imaging-resolution-1000-times-1201 https://doi.org/10.1007/s11263-017-1025-7 https://doi.org/10.1364/OE.25.001173
  53. 53. RangeSensing PlentyofOptions http://3dscanexpert.com/photogrammetry-benchmarks-r emake-vs-photoscan-vs-realitycapture-vs-zephyr/ This post is just an example based on a single photoset from a single object. That makes it zero percent scientific. Also, RealityCapture might have won this Drag Race in terms of both speed with the Fast preset and quality with the Normal preset, but an organic object like this is very favorable to its algorithms. Read my Full RC Review to see that it can’t always handle non-organic objects well. COMMERCIAL SOFTWARE http://3dscanexpert.com/ By Nick Lievendag Entrepreneur at the intersection of Creativity × Technology. Writes, Speaks and Consults about 3D Capture (3D Scanning & Photogrammetry). Founder of 3D Scan Expert.
  54. 54. Matterportdominating RealEstatescanning This $4,500 camera turns the real world into the virtual one. Today, Matterport ’s hardware is a hit with real estate agents. But fueled by the $30 million Series C it just raised, Matterport’s software and partnership with Google’s Project Tango could let you wave your phone around to create VR tours of anywhere you want. https://techcrunch.com/2015/06/25/matterport/ https://www.crunchbase.com/organization/matterport#/entity Matterport spawned out of the Xbox Kinect hacker scene in 2010. Founder Matt Bell had been working for a gesture recognition company that relied on a $50,000 camera and expert operators to produce a huge CAD file that could only be accessed through a specialized application. Bell was flabbergasted by the power of the $150 Kinect. He realized the potential for a relatively cheap device with similar technology that could let anyone map out rooms to create 3D models accessible straight from the web. https://youtu.be/HZX8RupfQls
55. 55. MatterportResearch onsemanticindoor segmentation We collected the data using the Matterport Camera, which combines 3 structured-light sensors to capture 18 RGB and depth images during a 360° rotation at each scan location. The output is the reconstructed 3D textured meshes of the scanned area, the raw RGB-D images, and camera metadata. We used this data as a basis to generate additional RGB-D data and make point clouds by sampling the meshes. We semantically annotated the data directly on the 3D point cloud, rather than images, and then projected the per point labels on the 3D mesh and the image domains. https://arxiv.org/abs/1702.01105 | Cited by 3 - Related articles https://arxiv.org/abs/1702.07600 https://www.fastcompany.com/3059281/introducing-hover-an-ai-powered-indoor-safe-camera-drone + Indoor scanning with the tripod-based Matterport still requires a lot of manual work, and at some point will be updated to an autonomous AI-powered indoor drone for a better user experience.
  56. 56. MatterportTechnologypatents Capturing and aligning multiple 3-dimensional sceneswww.google.com/patents/US8879828Grant - Filed Jun 29, 2012 - Issued Nov 4, 2014 - Matthew Bell - Matterport, Inc. Multi-modal method for interacting with 3d models www.google.com/patents/US20130342533App. - Filed Jun 24, 2013 - Published Dec 26, 2013 - Matthew Bell - Matterport, Inc. Identifying and filling holes across multiple aligned three-dimensional scenes www.google.com/patents/US8861840Grant - Filed Oct 14, 2013 - Issued Oct 14, 2014 - Matthew Bell - Matterport, Inc. Building a three-dimensional composite scene www.google.com/patents/US8861841Grant - Filed Oct 14, 2013 - Issued Oct 14, 2014 - Matthew Bell - Matterport, Inc. Processing and/or transmitting 3D data www.google.com/patents/US9396586Grant - Filed Mar 14, 2014 - Issued Jul 19, 2016 - Matthew Tschudy Bell - Matterport, Inc. Semantic understanding of 3d data www.google.com/patents/US20160055268App. - Filed Jun 6, 2014 - Published Feb 25, 2016 - Matthew Tschudy Bell - Matterport, Inc. Selecting two-dimensional imagery data for display within a three-dimensional model www.google.com/patents/EP3120329A1?cl=enApp. - Filed Mar 13, 2015 - Published Jan 25, 2017 - Matthew Tschudy BELL - Matterport, Classifying, separating and displaying individual stories of a three-dimensional model of a multi-story structure based on captured image data of the multi-story structure www.google.com/patents/US20160217225App. - Filed Jan 28, 2016 - Published Jul 28, 2016 - Matthew Tschudy Bell - Matterport, Inc. Semantic understanding of 3d data US 20160055268 A1 ABSTRACT Systems and techniques for processing three- dimensional (3D) data are presented. Captured three- dimensional (3D) data associated with a 3D model of an architectural environment is received and at least a portion of the captured 3D data associated with a flat surface is identified. Furthermore, missing data associated with the portion of the captured 3D data is identified and additional 3D data for the missing data is generated based on other data associated with the portion of the captured 3D data. REFERENCED BY US9576184 Textura Planswift Corporation Detection of a perimeter of a region of interest in a floor plan document US20130328872 Tekla Corporation Computer aided modeling US20150227644 Pictometry International Corp. Method and system for displaying room interiors on a floor plan US20160063722 Textura Planswift Corporation Detection of a perimeter of a region of interest in a floor plan document US20160379405 Jim S Baca Technologies for generating computer models, devices, systems, and methods utilizing the same
  57. 57. GoogleTangoTechnology http://www.deccanchronicle.com/technology/gadgets/210717/i s-google-tango-relevant-in-2017.html https://arstechnica.co.uk/gadgets/2016/12/google- tango-phab-2-pro-review/ A Project Tango device ‘sees’ the environment around it through a combination of three core functions. First up is motion tracking, which allows the device to understand its position and orientation using a range of sensors (including accelerometer and gyroscope). Then there’s depth perception, which examines the shape of the world around you. Intel provides a vital cog in this respect with its RealSense 3D camera. With this component on board, a device can gain accurate gesture control and snappy 3D object rendering among other things. Finally, Project Tango incorporates area learning, which means that it maps out and remembers the area around it. Point Cloud Framework for Rendering 3D Models Using Google Tango Maxen Chung, Santa Clara University Julian Callin, Santa Clara University http://scholarcommons.scu.edu/cseng_senior/84 https://doi.org/10.1007/s11227-016-1891-8 Project Tango Tablet Development Kit, recently introduced by Google, Inc. Equipped with the most powerful processor available to date on a consumer-level mobile platform (i.e., NVIDIA Tegra K1 whose 192 programmable CUDA-enabled GPU cores use the same efficient Kepler architecture found in the world’s most powerful supercomputers and workstations) along with several sensors (motion tracking camera, 3D depth sensor, accelerometer, ambient light sensor, barometer, compass, GPS, gyroscope), this mobile device can readily utilize GPU computing making it an ideal platform for developing real-time contextual awareness applications for the visually impaired (VI). Moreover, being compact, lightweight, potentially wearable, relatively discreet and affordable render it aesthetically appealing, socially acceptable and accessible for VI users
  58. 58. GoogleTangoExampleApplications#1 We broke the news yesterday that Google was producing a prototype 3D sensing smartphone called Project Tango. We also broke down the capabilities of the vision processor inside the device and talked about what it means for the future of phones. Now, we’ve got an exclusive look in the video below at a real 3D indoor map of a room captured with one of the prototype devices by Matterport. https://techcrunch.com/2014/02/21/heres-an-actual-3d-indoor-map-of-a-room-captured-with-googles-project-tango-phone/ https://matterport.com/mobile-3d-capture/ https://developers.google.com/tango/apis/overview Daydream is Google’s platform for virtual reality. It consists of Daydream-ready phones, Daydream-ready headsets and controllers, and Daydream apps. Daydream View is the first Daydream-ready headset and controller designed and developed by Google. It also comes with a touch-and-motion enabled controller so you can easily interact with VR apps. With the Daydream View, you will be able to explore new worlds through Google Street View and Fantastic Beasts. Kick back in your personal cinema with YouTube, Netflix, Hulu, and HBO. Get in the game with Gunjack 2, LEGO® BrickHeadz, and Need for Speed. That’s just the beginning of the VR possibilities with Daydream. http://www.techphlie.com/ 2017/07/what-is-google-ta ngo-and-daydream.html Google has notably been pushing AR/VR technologies with its latest Android OS. The most prominent introduction however, has been the ASUS ZenFone AR launch that took place at CES, 2017, earlier this year.
  59. 59. GoogleTangoExampleApplications#2 Google Tango SDK examples: how to make a floor plan in 50 seconds Alexander Grau Google Tango and Revit Leonardo Manzione https://www.youtube.com/watch?v=A-4cuJ1kOQ4
  60. 60. “GoogleTango”withoutdepth sensors I have always believed that bringing 3D to consumers could only work without the need for dedicated depth sensors. This pure-software approach is already being embraced for Augmented Reality with Apple’s upcoming ARKit and Google’s ARCore which was announced last week. Both can give modern smartphones AR-capabilities by just using the regular camera(s), instead of using dedicated sensors like Tango. https://3dscanexpert.com/sony-3d-creator-brings-sensor-less-3d-scanning-consumers/ But yesterday, at IFA Berlin, Sony announced its latest smartphone, the XZ1. Which has all the bells and whistles you expect from a flagship Android phone but also an app called 3D Creator . It basically does exactly what Microsoft showed last year, but is actually available — albeit exclusive for the XZ1. https://www.sonymobile.com/global-en/products/phones/xperia -xz1/3d-creator/
61. 61. AppleDepthSensing The iPhone X’s notch is basically a Kinect, by Paul Miller @futurepaul, Sep 17, 2017, 10:00am EDT https://www.theverge.com/circuitbreaker/2017/9/17/16315510/iphone-x-notch-kinect-apple-primesense-microsoft And now, in late 2017, Apple is going to sell a phone with a front-facing depth camera. Unlike the original Kinect, which was built to track motion in a whole living room, the sensor is primarily designed for scanning faces and powers Apple’s Face ID feature. Apple’s “TrueDepth” camera blasts “more than 30,000 invisible dots” and can create incredibly detailed scans of a human face. In fact, while Apple’s Animoji feature is impressive, the developer API behind it is even wilder: Apple generates, in real time, a full animated 3D mesh of your face, while also approximating your face’s lighting conditions to improve the realism of AR applications. How Apple’s iPhone X TrueDepth Camera Works, by David Cardinal on September 14, 2017. Beyond the Camera: Facial Motions and Changing Features Getting a depth estimate for portions of a scene is only the beginning of what’s required for Apple’s implementation of secure facial recognition and Animojis. For example, a mask could be used to hack a facial recognition system that relied solely on the shape of the face. So Apple is using processing power to learn and recognize 50 different facial motions that are much harder to forge. They also provide the basis for making Animoji figures seem to mimic the phone’s owner. How Secure is Face ID? Given how willing Apple is to commit to using Face ID for financial transactions, I’m sure they have pushed the limits beyond either simple 3D models or 2D motion. It is likely they are relying on the phone’s ability to recognize minute facial movements and feed them into a machine learning system on the A11 Bionic chip that will add another layer of security to the system. That piece will also be key in helping the phone decide whether you’re the same person when you put on a pair of glasses, a hat, or grow a beard — all of which Apple claims Face ID will handle.
  62. 62. Laserscanning LIDARtechnology
  63. 63. LaserScanning LiDAR(LightDetection AndRanging) http://dx.doi.org/10.1038/nphoton.2010.148 http://dx.doi.org/10.1080/19479832.2013.811124 3D building modeling (BIM) using images and LiDAR: a review https://techcrunch.com/2017/07/12/nyu-releases-the-largest-lidar- dataset-ever-to-help-urban-development/ http://ia.cr/2017/613 https://www.theregister.co.uk/2017/06/27/lidar_spoofed_bad_news_for_self_driving_cars/
  64. 64. VelodyneThemoston newsduetoautonomousdriving http://velodynelidar.com/ https://www.youtube.com/watch?v=8nTFjVm9sTQ https://www.youtube.com/watch?v=nXlqv_k4P8Q http://spectrum.ieee.org/cars-that-think/transportation/se nsors/velodyne-announces-a-solidstate-lidar http://spectrum.ieee.org/cars-that-think/transportati on/sensors/israeli-stealth-startup-innoviz-promises-1 00-solidstate-automotive-lidar-by-2018 http://spectrum.ieee.org/transportation/advanced-cars/cheap-lidar-the-k ey-to-making-selfdriving-cars-affordable
  65. 65. RieglA rangeof differentlaserscanners http://www.riegl.com/products/unmanned-scanning/ RIEGL VZ-400 Indoor Scanned Data by Jamis Choi, Published on Apr 1, 2010 https://www.youtube.com/watch?v=hOf0hpCn92I Scanning made simple with RiSOLVE - RIEGL's new 3D Scene Capture Software Published on Oct 4, 2012 (feat. horrible lounge music) https://www.youtube.com/watch?v=lbxvzMlTWyg
  66. 66. Rieglsystemin practice https://doi.org/10.1109/IROS.2016.7759501 Namely, we propose a method for the automatic selection of feature coordinate locations, and introduce the concept of localized automatic relevance determination (LARD) to the Hilbert Maps framework, in which different dimensions in the projected Hilbert space operate within independent length scale values. The proposed technique was tested against other state-of-the-art 3D scene reconstruction tools in three different datasets: a simulated indoors environment, RIEGL laser scans and dense LSD-SLAM pointclouds. The results testify to the proposed framework’s ability to model complex structures and correctly interpolate over unobserved areas of the input space while achieving real-time training and querying performances.
  67. 67. HandheldScanning GeoSLAMZEB-REVO Handheld Laser Scanning - ZEB-REVO The ZEB-REVO is the latest, lightweight revolving laser scanner from GeoSLAM. Handheld, pole-mounted or attached to a mobile platform, the ZEB-REVO can record more than 40,000 measurement points per second from the survey environment. NEW ZEB-CAM The new ZEB-CAM is an optional upgrade for standard ZEB-REVO systems. Simply attach ZEB-CAM to the underside of a standard REVO and begin scanning immediately. The ZEB-CAM captures live video footage of the survey environment and adds contextual video and imagery to scan data to aid feature identification. Optical flow technology is utilised to accurately synchronise the video and scan together in GeoSLAM's Desktop software. http://www.3dlasermapping.com/zeb-revo- handheld-laser-scanning/ https://youtu.be/k8q5xr_eLgk
  68. 68. GeoSlamvs.Leica Portablescanningquality http://dx.doi.org/10.1117/12.2270761 The paper investigates the performances of two portable mobile mapping systems (MMSs), the handheld GeoSLAM ZEB-REVO and Leica Pegasus:Backpack, in two typical user-case scenarios: an indoor two-floors building and an outdoor open city square. Note! This paper would have been even nicer with a ‘gold standard’ giving the “correct measurements” instead of just comparing two “good enough” scanners.
69. 69. ResearchScanners SensorFusion The Indoor Multi-sensor Acquisition System (IMAS) presented in this paper consists of a wheeled platform equipped with two 2D laser heads, RGB cameras, a thermographic camera, a thermohygrometer, and a luxmeter. One of the laser scanning sensors is used to obtain the building map and the navigation information, and the other one for the 3D environment reconstruction. The thermographic and optical images, and the geometric and comfort data, are synchronized and automatically linked to trajectory positions, so that they are georeferenced in the building in terms of a relative positioning system. Software interface for virtual immersive navigation and ex situ data analysis. http://dx.doi.org/10.3390/s16060785
  70. 70. AppliedPointCloud Scans Accessibility Point Clouds to Indoor/Outdoor Accessibility Diagnosis J. Balado, L. Díaz-Vilariño, P. Arias, I. Garrido https://www.isprs-ann-photogramm-remote-sens-spatial-inf-sci.net/IV-2-W4/287/2017/isprs-annals-IV-2- W4-287-2017.pdf This work presents an approach to automatically detect structural floor elements such as steps or ramps in the immediate environment of buildings, elements that may affect the accessibility to buildings. The methodology is based on Mobile Laser Scanner (MLS) point cloud and trajectory information. The methodology is tested in a real case study, consisting of 100 m of an urban street. Ground elements are correctly classified in an acceptable computation time. Steps and ramps also are exported to GIS software to enrich building models from Open Street Map with information about accessible/inaccessible entrances and their locations. http://www.wired.co.uk/article/wayfindr-app A project initiated by the Royal London Society for the Blind's (RLSB) Youth Forum has led to the prototyping of a new app called Wayfindr, which has been built especially to help blind and partially sighted people use London's transport network independently. The app relies on smartphones and iBeacons and has been developed in collaboration with global digital product design studio ustwo Our Open Standard gives you the tools to create inclusive and consistent experiences for your vision impaired customers. From transport networks and shopping centres, to hospitals and any other indoor space - we can help. Through our on-site trials and consultancy we will work together with you to understand how digital wayfinding can make your estate accessible. https://www.wayfindr.net/
  71. 71. Post-processing Rawpointcloudsaremassiveandpossiblycontain alotof redundantdatapoints
72. 72. DataQuality compromisebetweenfilesize,computationaltimeandquality 3D model reconstruction from point cloud processed either with OpenSFM, VisualSFM or Pix4D (top row) to mesh model (middle row) to final textured 3D model (bottom row) across a series of downsampled Sky Ranger UAV imagery, including full resolution (first column), half resolution (second column) and quarter resolution (last column). Bolick and Harguess (2016), http://dx.doi.org/10.1117/12.2224677 Garbage in, garbage out holds true as always. The more high-quality images / points you have as input, the higher the reconstruction quality will obviously be. Top-left: points sampled on a sphere and corrupted with a lot of noise. Top-right: reconstructed surface mesh. Bottom-left: smoothed point set. Bottom-right: reconstructed surface mesh. Reconstruction error (mm) against number of points for the Bimba con Nastrino point set with 1.6M points as well as for simplified versions. CGAL 4.10 - Poisson Surface Reconstruction The sensitivity of biological finite element models to the resolution of surface geometry: a case study of crocodilian crania: “Example of the simplified models. C. moreletti models composed of 20k, 30k, 90k and 300k surface (mesh) elements.” https://doi.org/10.7717/peerj.988 point cloud & mesh processing MAY 27 2017, posted by Taylor Wang The final goal is to get a fully editable NURBS CAD model so that it can be modified by any CAD software to improve the design or reproduce the product.
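To make the file-size/quality compromise concrete, the sketch below downsamples a raw cloud at a few voxel resolutions and meshes each with Poisson reconstruction. It assumes a recent Open3D release and a hypothetical input file; the specific tools mentioned above (OpenSFM, VisualSFM, Pix4D, CGAL) are not reproduced here.

```python
# Sketch of the size/quality trade-off discussed above: voxel-downsample a raw
# point cloud at a few resolutions, then run Poisson surface reconstruction.
# Assumes the Open3D library (>= 0.10); the input file name is hypothetical.
import open3d as o3d

raw = o3d.io.read_point_cloud("apartment_scan.ply")
print("raw points:", len(raw.points))

for voxel in (0.005, 0.02, 0.08):                     # metres per voxel
    pcd = raw.voxel_down_sample(voxel_size=voxel)
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=4 * voxel, max_nn=30))
    mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=8)
    print(f"voxel={voxel:.3f} m -> {len(pcd.points)} pts, {len(mesh.triangles)} tris")
    o3d.io.write_triangle_mesh(f"mesh_{voxel:.3f}.ply", mesh)
```

Comparing the printed point and triangle counts against the visual quality of the exported meshes is a quick way to pick a working resolution for a given scanner and scene.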
73. 73. PointCloudLibrary(PCL) The mostpopular open-sourcelibrary http://unanancyowen.com/en/pcl-with-velodyne/ https://www.youtube.com/watch?v=7BUFxkyH1r0 https://doi.org/10.1109/MRA.2012.2206675 Cited by 186 articles - see Related articles
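A minimal sketch of a typical PCL cleanup pass, written here against the python-pcl bindings (the equivalent filters exist in the C++ API); the input file name and filter parameters are illustrative, not recommendations.

```python
# Minimal sketch of a typical PCL cleanup step via the python-pcl bindings:
# statistical outlier removal followed by voxel-grid downsampling.
# The input file name and parameter values are illustrative.
import pcl

cloud = pcl.load("room_scan.pcd")

sor = cloud.make_statistical_outlier_filter()   # drop isolated, noisy points
sor.set_mean_k(50)
sor.set_std_dev_mul_thresh(1.0)
cleaned = sor.filter()

vg = cleaned.make_voxel_grid_filter()           # thin the cloud to ~1 cm resolution
vg.set_leaf_size(0.01, 0.01, 0.01)
downsampled = vg.filter()

pcl.save(downsampled, "room_scan_clean.pcd")
print(cloud.size, "->", downsampled.size)
```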
  74. 74. Otherlibraries CGALandresearchcode
75. 75. Driftcorrection forproperimageregistration https://doi.org/10.1109/ROBOT.2010.5509312 Correcting for drift (distortion) between different scans or overlapping point clouds with added velocity information for the ICP (Iterative Closest Point) algorithm. (a) is a given environment. Blue points in (b) show distortion of the scan, and red points in (b) show the compensated scan. Transformation estimated using distorted data includes inevitable errors (c). Transformation estimated from the rectified scan gives us more accurate results (d). Kaarta - Common point cloud registration issues http://www.kaarta.com/cloud-registration-issues/ Published: 8 March 2017 http://dx.doi.org/10.3390/s17030539 Keywords: LiDAR; inertial measurement unit; iterative closest point; iterated sigma point Kalman filter; time delay calibration
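As a baseline for the registration issues discussed above, the sketch below runs plain point-to-plane ICP with Open3D; the velocity-compensated ICP of the cited paper is not reproduced, the file names are hypothetical, and older Open3D versions expose the same call under `o3d.registration` rather than `o3d.pipelines.registration`.

```python
# Basic point-to-plane ICP alignment as a starting point for the registration
# issues discussed above (the velocity-compensated variant is not reproduced).
# Assumes a recent Open3D; file names are hypothetical.
import numpy as np
import open3d as o3d

source = o3d.io.read_point_cloud("scan_A.ply")
target = o3d.io.read_point_cloud("scan_B.ply")
for pcd in (source, target):
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

result = o3d.pipelines.registration.registration_icp(
    source, target,
    0.05,                # max correspondence distance: 5 cm gating
    np.eye(4),           # initial guess; an odometry/IMU prior would go here
    o3d.pipelines.registration.TransformationEstimationPointToPlane())

print("fitness:", result.fitness, "RMSE:", result.inlier_rmse)
source.transform(result.transformation)   # bring scan A into scan B's frame
```

A good initial guess (from odometry, an IMU, or a coarse global registration) matters more in practice than the ICP variant itself, which is exactly why the drift-compensation work above feeds velocity information into the alignment.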
  76. 76. DataReduction andsimplificationfor storage Imran Ashraf ; Soojung Hur ; Yongwan Park https://doi.org/10.1109/ACCESS.2017.2699686 LIDAR produces large point cloud, but, while generating images for limited field of view, data sparsity results in poor quality images. Moreover, 3D to 2D data transformation also involves data reduction, which further deteriorates the quality of images. http://dx.doi.org/10.1117/12.2270833 31 October 2016 https://doi.org/10.1109/TIP.2016.2623488 https://www.google.com/patents/US9582939 https://arxiv.org/abs/1609.00893 Keywords: Tensor networks, Function-related tensors, CP decomposition, Tucker models, tensor train (TT) decompositions, matrix product states (MPS), matrix product operators (MPO), basic tensor operations, multiway component analysis, multilinear blind source separation, tensor completion, linear/multilinear dimensionality reduction, large-scale optimization problems, symmetric eigenvalue decomposition (EVD), PCA/SVD, huge systems of linear equations, pseudo-inverse of very large matrices, Lasso and Canonical Correlation Analysis (CCA) https://doi.org/10.1016/j.isprsjprs.2016.06.012 In-base point cloud management pipeline in the point cloud server (PCS).
77. 77. DataReduction CompressingPointClouds Dynamic polygon cloud compression Eduardo Pavez ; Philip A. Chou (2017) https://doi.org/10.1109/ICASSP.2017.7952694 We introduce a compressible representation of 3D geometry (including its attributes, such as color texture) intermediate between polygonal meshes and point clouds called a polygon cloud. Polygon clouds, compared to polygonal meshes, are more robust to live capture noise and artifacts. Furthermore, dynamic polygon clouds, compared to dynamic point clouds, are easier to compress, if certain challenges are addressed. In this paper, we propose methods for compressing dynamic polygon clouds using transform coding of color and motion residuals. Real-time compression of point cloud streams Julius Kammerl ; Nico Blodow ; Radu Bogdan Rusu ; Suat Gedikli ; Michael Beetz ; Eckehard Steinbach (2012) https://doi.org/10.1109/ICRA.2012.6224647 We present a novel lossy compression approach for point cloud streams which exploits spatial and temporal redundancy within the point data. Our proposed compression framework can handle general point cloud streams of arbitrary and varying size, point order and point density. Furthermore, it allows for controlling coding complexity and coding precision. To compress the point clouds, we perform a spatial decomposition based on octree data structures. 3D Reconstruction Framework for Multiple Remote Robots on Cloud System Phuong Minh Chu, Seoungjae Cho, Simon Fong, Yong Woon Park and Kyungeun Cho (2017) http://dx.doi.org/10.3390/sym9040055 This paper proposes a cloud-based framework that optimizes the three-dimensional (3D) reconstruction of multiple types of sensor data captured from multiple remote robots. A working environment using multiple remote robots requires massive amounts of data processing in real-time, which cannot be achieved using a single computer. In the proposed framework, reconstruction is carried out in cloud-based servers via distributed data processing.
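The octree-based codecs above exploit spatial redundancy; the toy sketch below illustrates only that core idea (quantise coordinates into voxels and store each occupied cell once as small integers). It is neither the PCL octree codec nor the polygon-cloud coder, just a NumPy illustration of the decomposition step.

```python
# Toy illustration of the spatial-decomposition idea behind octree point-cloud
# compression: quantise coordinates to a voxel grid and store each occupied
# cell once as small integer indices. Not the PCL octree codec itself.
import numpy as np

def compress(points: np.ndarray, resolution: float = 0.01):
    """Lossy-compress an (N, 3) float cloud into integer voxel indices + origin."""
    origin = points.min(axis=0)
    idx = np.floor((points - origin) / resolution).astype(np.uint32)
    occupied = np.unique(idx, axis=0)            # duplicates inside a voxel collapse
    return occupied, origin, resolution

def decompress(occupied, origin, resolution):
    """Reconstruct voxel-centre points from the compressed representation."""
    return origin + (occupied.astype(np.float64) + 0.5) * resolution

pts = np.random.rand(200_000, 3) * 5.0           # stand-in for a real scan (metres)
voxels, origin, res = compress(pts, resolution=0.02)
restored = decompress(voxels, origin, res)
ratio = pts.nbytes / (voxels.nbytes + origin.nbytes)
print(f"{len(pts)} pts -> {len(voxels)} voxels, ~{ratio:.1f}x smaller at {res*100:.0f} cm")
```

Real codecs go further by entropy-coding the occupancy tree level by level and by exploiting temporal redundancy between frames, which is where the bulk of the compression gains in the cited papers come from.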
  78. 78. Data-drivenprocessing Likein allthefieldsofcomputervision,real-timescanning,post- processingandsemanticunderstandingareimprovedwith recent deeplearningandartificial intelligencetechniques
  79. 79. DeepLearningbeyondnon-euclidean problems Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, andPierre Vandergheynst https://doi.org/10.1109/MSP.2017.2693418 https://arxiv.org/abs/1705.10819
  80. 80. DeepLearningPointclouds https://arxiv.org/abs/1704.03847 https://arxiv.org/abs/1705.03428
  81. 81. DeepLearningPointNet++ PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas Stanford University, (Submitted on 7 Jun 2017) https://arxiv.org/abs/1706.02413 Illustration of our hierarchical feature learning architecture and its application for set segmentation and classification using points in 2D Euclidean space as an example. Single scale point grouping is visualized here. Left: Point cloud with random point dropout. Right: Curve showing advantage of our density adaptive strategy in dealing with non-uniform density. DP means random input dropout during training; otherwise training is on uniformly dense points Scannet labeling results. PointNet captures the overall layout of the room correctly but fails to discover the furniture. Our approach, in contrast, is much better at segmenting objects besides the room layout.
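For orientation, the sketch below shows the shared-MLP plus symmetric max-pooling block that PointNet introduced and that PointNet++ applies hierarchically to local neighbourhoods. It is a flat PointNet-style classifier in PyTorch; the farthest-point sampling and multi-scale grouping of PointNet++ are omitted.

```python
# Sketch of the per-point shared MLP + symmetric max-pool block that PointNet
# uses and that PointNet++ applies hierarchically on local neighbourhoods.
# A flat classifier only; the multi-scale grouping of PointNet++ is omitted.
import torch
import torch.nn as nn

class PointNetClassifier(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Shared MLP: the same weights are applied to every point (1x1 Conv1d over N points).
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, num_classes))

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (batch, num_points, 3) unordered points.
        feats = self.point_mlp(xyz.transpose(1, 2))      # (batch, 1024, num_points)
        global_feat = feats.max(dim=2).values            # order-invariant max pool
        return self.head(global_feat)                    # (batch, num_classes)

model = PointNetClassifier(num_classes=16)
logits = model(torch.rand(4, 2048, 3))                   # 4 clouds of 2048 points each
print(logits.shape)                                      # torch.Size([4, 16])
```

The max pooling is what makes the network invariant to point order; PointNet++ gains its segmentation quality by running this block repeatedly on sampled local regions instead of on the whole cloud at once.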
  82. 82. DeepLearning2DFeatureDescriptors Instead of using the old-school SIFT, SURF, ORB, etc., the feature descriptor / matching can be done with data-driven deep learning network as well Note This model was trained with SfM data, which does not have strong rotation changes. Newer models work better in this case, which will be released soon. In the meantime, you can also use the models in the learn-orientation, benchmark-orientation. https://github.com/cvlab-epfl/LIFT https://arxiv.org/abs/1603.09114 | Cited by 23 Related articles
  83. 83. DeepLearning3DFeatureDescriptors https://arxiv.org/abs/1706.04496 We present a view-based convolutional network that produces local, point-based shape descriptors. The network is trained such that geometrically and semantically similar points across different 3D shapes are embedded close to each other in descriptor space (left). Our produced descriptors are quite generic — they can be used in a variety of shape analysis applications, including dense matching, prediction of human affordance regions, partial scan-to-shape matching, and shape segmentation (right). In contrast to findings in the image analysis community where learned 2D descriptors are ubiquitous and general (e.g. LIFT), learned 3D descriptors have not been as powerful as 2D counterparts because they (1) rely on limited training data originating from small-scale shape databases, (2) are computed at low spatial resolutions resulting in loss of detail sensitivity, and (3) are designed to operate on specific shape classes, such as deformable shapes. We generate training correspondences automatically by leveraging highly structured databases of consistently segmented shapes with labeled parts. The largest such database is the segmented ShapeNetCore dataset [ Yi et al. 2016, https://www.shapenet.org/] that includes 17K man-made shapes distributed in 16 categories
  84. 84. Meshgenerativeshapeswith GAN https://arxiv.org/abs/1705.02090 Our key insight is that 3D shapes are effectively characterized by their hierarchical organization of parts, which reflects fundamental intra-shape relationships such as adjacency and symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a flat, unlabeled, arbitrary part layout to a compact code. The code effectively captures hierarchical structures of man-made 3D objects of varying structural complexities despite being fixed-dimensional: an associated decoder maps a code back to a full hierarchy. The learned bidirectional mapping is further tuned using an adversarial setup to yield a generative model of plausible structures, from which novel structures can be sampled. It would be interesting to thoroughly investigate the effect of code length on structure encoding. Finally, it is worth exploring recent developments in GANs, e.g. Wasserstein GAN [Arjovsky et al. 2017], in our problem setting. It would also be interesting to compare with plain VAE and other generative adaptations.
  85. 85. PointCloud generativeGANsforpointclouds #1a https://arxiv.org/abs/1707.02392 We build an end-to-end pipeline for 3D point clouds that uses an autoencoder (AE) to create a latent representation, and a Generative Adversarial Networks (GAN) to generate new samples in that latent space. Our AE is designed with a structural loss tailored to unordered point clouds. Our learned latent space, while compact, has excellent class- discriminative ability: per our classification results, it outperforms recent GAN-based representations by 4.3%. In addition, the latent space allows for vector arithmetic, which we apply in a number of shape editing scenarios, such as interpolation and structural manipulation. We argue that jointly learning the representation and training the GAN is unnecessary for our modality. We propose a workflow that first learns a representation by training an AE with a compact bottleneck layer, then trains a plain GAN in that fixed latent representation. One benefit of this approach is that AEs are a mature technology: training them is much easier and they are compatible with more architectures than GANs. We point to theory that supports this idea, and verify it empirically: we show that GANs trained in our learned AE-based latent space generate visibly improved results, even with a generator and discriminator as shallow as a single hidden layer. Within a handful of epochs, we generate geometries that are recognized in their right object class at a rate close to that of ground truth data. Importantly, we report significantly better diversity measures (10x divergence reduction) over the state of the art, establishing that we cover more of the original data distribution. In summary, we contribute. ● An effective cross-category AE-based latent representation on point clouds. ● The first (monolithic) GAN architecture operating on 3D point clouds. ● A surprisingly simpler, state-of-the-art GAN working in the AE’s latent space. 1) Autoencoder For fixed latent representation Vector arithmetic 2) Generative Adversarial Network Using the fixed latent representation In our latent-space GAN, instead of operating on the raw point cloud input, we pass the data through our pre-trained autoencoder, trained separately for each object class with the Earth Mover’s distance (EMD) loss function. Both the generator and the discriminator of the GAN then operate on the 512- dimensional bottleneck variable of the AE. Finally, once the GAN training is over, the output of the generator is decoded to a point cloud via the AE decoder. We found that very shallow designs for both the generator and discriminator (in our case, 1 hidden layer for the generator and 2 for the discriminator) are sufficient to produce realistic results
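A rough PyTorch sketch of the latent-space GAN workflow described above: assuming a point-cloud autoencoder has already been trained (EMD loss, 512-d bottleneck), a shallow generator and discriminator are trained on its fixed latent codes. The autoencoder itself and the EMD loss are not shown; the `decode` mentioned in the final comment is a placeholder for its decoder.

```python
# Sketch of the latent-space GAN workflow described above: with a point-cloud
# autoencoder already trained (EMD loss, 512-d bottleneck), a shallow generator
# and discriminator are trained on the fixed latent codes. The AE is assumed
# given; encode/decode are placeholders for it.
import torch
import torch.nn as nn

LATENT, NOISE = 512, 128

generator = nn.Sequential(                     # one hidden layer, as in the paper
    nn.Linear(NOISE, 256), nn.ReLU(),
    nn.Linear(256, LATENT))
discriminator = nn.Sequential(                 # two hidden layers
    nn.Linear(LATENT, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 64), nn.LeakyReLU(0.2),
    nn.Linear(64, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_codes: torch.Tensor):
    """One GAN step on a batch of AE bottleneck codes (shape: batch x 512)."""
    b = real_codes.size(0)
    fake_codes = generator(torch.randn(b, NOISE))

    # Discriminator: real latent codes vs. generated ones.
    opt_d.zero_grad()
    d_loss = bce(discriminator(real_codes), torch.ones(b, 1)) + \
             bce(discriminator(fake_codes.detach()), torch.zeros(b, 1))
    d_loss.backward()
    opt_d.step()

    # Generator: try to fool the discriminator.
    opt_g.zero_grad()
    g_loss = bce(discriminator(fake_codes), torch.ones(b, 1))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

# After training, decode(generator(torch.randn(1, NOISE))) would yield a new
# point cloud through the autoencoder's decoder (decode is the placeholder AE part).
```

Keeping the GAN this shallow is exactly the point the authors make: because the heavy lifting is done by the pre-trained autoencoder, the adversarial part stays small and easy to train.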
  86. 86. PointCloud generativeGANsforpointclouds #1b Interpolating between different point clouds, using our latent space representation. Note the interpolation between structurally and topologically different shapes. Generative results using our latent-space GAN. Note the variability and fidelity of the result. For a recap on GANs, you could see for example: https://arxiv.org/abs/1701.07875 Cited by 106 - Related articles What does GANs for point clouds mean in practice? Point-cloud super-resolution (e.g. Ledig et al. 2016 for natural images), to improve model appearance (e.g. remove staircasing), and inpainting (e.g. Iizuka et al. 2017) to handle occlusion and gaps from indoor scans (“shape completion”). “Visual plastic surgery” in other words (Tung et al. 2017) Sung et al. (2015) Data-driven Structural Priors for Shape Completion Mönch et al. (2010) Staircase-Aware Smoothing of Medical Surface Meshes
87. 87. HardwarePointCloud Super-resolution multiplescans https://doi.org/10.2312/SPBG/SPBG06/009-015 Cited by 47 articles On the left, one scan of the parrot statue, with a sample spacing of about 1mm. Center, we combine 100 nearly identical such scans to produce the surface in the center, produced on a grid with sample spacing of about 0.3mm. Notice the noise reduction and the improvement in the detail, for instance in the face, neck and wing feathers. On the right, a photograph of the parrot statue. Super-resolution reconstruction using only 30 input scans at the left and increasing to 140 at the right. Noise is reduced dramatically at the beginning but more slowly at the end. Surfaces were reconstructed from subsets which were pre-registered using all 140 scans. For absolute measurement accuracy (e.g. Biljecki et al. 2017), one can scan the same space multiple times A thin strip of the super-resolved surface, and the nearby sample points from the input scans. The input is very noisy, but the points are densely and randomly distributed near the surface with few outliers, so the average gives an accurate representation of the surface. (a) One scan. (b) Final super-resolved surface from 100 scans. (c) Photo of the object (a plaster cast of a subway token). The bottom row shows some results of other kinds of processing, to evaluate the importance of the various steps of the algorithm. (d) One scan, bilinearly interpolated onto the finer grid and smoothed. Detail is missing. (e) The entire algorithm except for the final bilateral filtering step. The noise removed by the filtering seems to be residual registration error, which perhaps could be improved. (f) Just averaging 100 scans taken without moving the scanner, using the same Gaussian kernel. Noise is decreased, but there is aliasing from the lower-resolution grid obscuring detail visible in (b).
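A toy NumPy illustration of the averaging idea above: many registered, noisy scans are binned onto a grid finer than the single-scan sample spacing and averaged per cell, so noise drops roughly with the square root of the scan count. Registration and the paper's bilateral filtering step are out of scope here, and the simulated data is purely illustrative.

```python
# Toy version of multi-scan super-resolution: bin many registered, noisy scans
# of the same surface onto a grid finer than the single-scan spacing and
# average the height per cell. Registration is assumed to be done already.
import numpy as np

def fuse_scans(scans, cell=0.0003):
    """Average a list of registered (N, 3) scans on a fine 2D grid (z = height)."""
    pts = np.concatenate(scans, axis=0)
    keys = np.floor(pts[:, :2] / cell).astype(np.int64)          # fine xy cells
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    sums = np.bincount(inv, weights=pts[:, 2])
    counts = np.bincount(inv)
    return sums / counts, counts                                 # mean z per cell

# Simulated example: 100 scans of the same wavy surface with ~1 mm noise.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 0.1, size=(20_000, 2))
def scan():
    z = 0.01 * np.sin(40 * xy[:, 0]) + rng.normal(0, 0.001, len(xy))
    return np.column_stack([xy + rng.normal(0, 0.0005, xy.shape), z])

heights, counts = fuse_scans([scan() for _ in range(100)])
print(len(heights), "cells, mean samples per cell:", counts.mean())
```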
  88. 88. DeepLearningSuper-Resolution Plentyofoptionsforimage/video/volumesuper-resolution https://arxiv.org/abs/1706.03142 https://arxiv.org/abs/1704.02738 https://arxiv.org/abs/1704.02470 https://arxiv.org/abs/1612.00085 Novel texture enhancement framework creates an HR style image that is rich in details, which can be used to restore high-frequency texture details back into the initial HR image via the style transfer algorithm. Four examples of SR results for nearest neighbor and cubic interpolation, the best-performing sparse coding, 3D- FSRCNN, and 3D-SRU-Net configurations. Arrows indicate regions in which at least one SR result mis- interprets a cell boundary or an ultrastructural feature. Scale bar 500 nm. Our method includes a sub-pixel motion compensation (SPMC) layer that can better handle inter-frame motion for this task. Our detail fusion (DF) network that can effectively fuse image details from multiple images after SPMC alignment
  89. 89. Point-cloudsuper-resolution Upsampling‘on-the-fly’toavoid“dataexplosion”? Jason Schreier 4/17/17 12:05pm Horizon Zero Dawn, Kotaku http://kotaku.com/horizon-zero-dawn-uses-all-sorts- of-clever-tricks-to-lo-1794385026 Games like this don’t just look incredible because of ‘hyper-realism’ but because their engineers use all sorts of tricks [LOD’ing, or Level of Detail; Mipmapping; frustum culling, etc.] to save memory. The engine is designed to produce models in CityGML and does so in multiple LODs. Besides the generation of multiple geometric LODs, we implement the realisation of multiple levels of spatiosemantic coherence, geometric reference variants, and indoor representations. The datasets produced by Random3Dcity are suited for several applications, as we show in this paper with documented uses. The developed engine is available under an open-source licence at Github at http://github.com/tudelft3d/Random3Dcity http://doi.org/10.5194/isprs-annals-IV-4-W1-51-2016 Filip Biljecki, Hugo Ledoux, Jantien Stoter Level of detail texture filtering with dithering and mipmaps US 5831624 A Original Assignee 3Dfx Interactive Inc https://www.google.com/patents/US5831624 Level-of-detail rendering: colors identify different subdivision levels as stated in the top left corner. Feature-Adaptive Rendering of Loop Subdivision Surfaces on Modern GPUs November 2014 DOI: 10.1007/s11390-014-1486-x ManyLoDs: Parallel Many-View Level-of-Detail Selection for Real- Time Global Illumination Matthias Hollander, Tobias Ritschel, Elmar Eisemann, Tamy Boubekeur (2011) http://dx.doi.org/10.1111/j.1467-8659.2011.01982.x
  90. 90. 3DContentgeneration VolumetricCapture Generatecontentbyscanningreal-lifescenesandobjects Kul Wadhwa's and Roddy O'Hara's Uncorporeal http://www.uncorporeal.com/ Uncorporeal: volumetric capture systems for VR & AR content creation. The team includes a technical Oscar-winner and engineering and product leadership from WETA, Google X, Lucas ILM, and Wikimedia. https://venturebeat.com/2016/10/13/pathbreaker-ventures-raises-12-milli on-to-invest-in-emerging-tech-such-as-vr-ar-and-robotics/ Ryan Gembala, founder of Pathbreaker Ventures believes connected homes and cars and autonomous vehicles will create a lot of opportunities in vertical applications for startups. And he also thinks that space technologies such as small satellites, analysis of space-captured data, consumer transport, space mining, and others are interesting. REALITYVIRTUAL.CO - A NEW ZEALAND BASED CREATIVE TECHNOLOGIES RESEARCH & DEVELOPMENT COLLECTIVE WITH AN ENTHUSIAST TOWARDS THE VISUAL REALM: ● unique post production & signal processing techniques including the development of deep learning image enhancement & automation throughout our 3D pipeline for PBR workflow ● strong emphasis on advanced robotics & autonomous operations for large data acquisition of 3D environments. 3D Scene Creation with Photogrammetry
  91. 91. 3DContentgeneration Automaticphotorealism#1 Stillcanbequitelabor-intensivetocreaterealisticcontent Get to know Rense de Boer, a technical art director from Sweden, who is not only pushing the envelope of photo-real CGI environments, but he’s doing it all in a real-time engine! Art by Rens https://news.developer.nvidia.com/artist-spotlight-creating-photorealistic-cgi-environments-in-real-time/ https://www.youtube.com/watch?v=bXouFfqSfxg One Ph.D. position (supervision by Profs Niessner and Rüdiger Westermann) is available at our chair in the area of photorealistic rendering for deep learning and online reconstruction Research in this project includes the development of photorealistic realtime rendering algorithms that can be used in deep learning applications for scene understanding, and for high-quality scalable rendering of point scans from depth sensors and RGB stereo image reconstruction. If you are interested in applying, you should have a strong background in computer science, i.e., efficient algorithms and data structures, and GPU programming, have experience implementing C/C++ algorithms, and you should be excited to work on state-of-the-art research in the 3D computer graphics. https://wwwcg.in.tum.de/group/joboffers/phd-position-photorealistic-rendering-for-deep-le arning-and-online-reconstruction.html Ph.D. Position – Photorealistic Rendering for Deep Learning and Online Reconstruction
  92. 92. 3DContentgeneration Automaticphotorealism#2 ConvertingLiDARscanstovisuallyhighquality3Dcontent Atom View is a new piece of software that allows content creators to translate real-world scans into assets for virtual environments. Not only does it aim to produce realistic results but also reduce the workflow for content creation. The standalone app takes files captured from volumetric cameras, offline graphics renderers, 360 lidar and more. Volumetric capture is a promising area of development that could one day allow content creators to skip over several of the more laborious steps of traditional 3D content creation with better results. With Atom View, users can even edit objects once they’ve been imported. https://youtu.be/YxRI_3gKP8g
93. 93. 3DContentgeneration Styletransfer formaps Neural Networks and The Future of 3D Procedural Content Generation by Sam Snider-Held, Creative Technologist at MediaMonks, focusing on the intersection of AR, VR, AI, and UX. Style transfer output on the left, real terrain on the right. Both are planes whose vertices are being displaced by the height map texture. Now it was time to create my own style transfer light field and light field renderer. I basically reimplemented Andrew Lowndes’ WebGL light field renderer in Unity. What this post demonstrates is the idea that neural networks could radically change how we generate 3D content. I went with light fields because currently my GPU is not fast enough to run style transfer or any other generative network at 60 FPS. But if we do get to that point, it’s entirely possible to see generative neural networks become an alternative rendering pipeline to the standard rasterization approach. In this way, neural networks could generate each frame of a game in real time, based on real-time feedback from the user. But it also potentially allows for a much more powerful creative approach, for the creator and the end user. Imagine playing Gears of War, but then telling the computer “Keep the gameplay, story, and 3D models, but make it look like Zelda: Breath of the Wild.” This is how creating or playing a future gaming experience could be, all because computers now know what things “look like” and can make other things “look like” them too.
  94. 94. 3DContentgeneration from Videoto3D Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks In Proceedings of SCA'17, Los Angeles, CA, USA, July 28-30, 2017 http://research.nvidia.com/publication/facial-performance-capture-deep -neural-networks Samuli Laine, Tero Karras, Timo Aila, Antti Herva (Remedy Entertainment), Shunsuke Saito (Pinscreen, University of Southern California), Ronald Yu (Pinscreen, University of Southern California), Hao Li (USC Institute for Creative Technologies, University of Southern California, Pinscreen), Jaakko Lehtinen (NVIDIA, Aalto University) NVIDIA and game developer Remedy (Alan Wake, Quantum Break) showcased their team-up solution to streamlining motion capture and animation using a deep learning neural network, running on NVIDIA’s powerful DGX-1 server. After being “trained” with information on previously produced animations, the network is able to generate sophisticated 3D facial animation from videos of live actors, greatly alleviating the time and labor burden of traditional mo-cap animation — it can even learn enough to generate facial animation from just an audio clip. The companies believe this system could eventually produce animation that’s just as good or better than traditionally produced fare. http://www.animationmagazine.net/events/siggraph-facial-animation-advances-fabri c-engine-the-french-contingent/ “We present a real-time deep learning framework for video-based facial performance capture -- the dense 3D tracking of an actor's face given a monocular video. Our pipeline begins with accurately capturing a subject using a high-end production facial capture pipeline based on multi-view stereo tracking and artist- enhanced animations. With 5-10 minutes of captured footage, we train a convolutional neural network to produce high-quality output, including self-occluded regions, from a monocular video sequence of that subject. Since this 3D facial performance capture is fully automated, our system can drastically reduce the amount of labor involved in the development of modern narrative-driven video games or films involving realistic digital doubles of actors and potentially hours of animated dialogue per character. “
95. 95. 3DContentgeneration from Video(&Audio) toVideo Face2Face: Real-time Face Capture and Reenactment of RGB Videos. Justus Thies (University of Erlangen-Nuremberg), Michael Zollhöfer (Max Planck Institute for Informatics), Marc Stamminger (University of Erlangen-Nuremberg), Christian Theobalt (Max Planck Institute for Informatics), Matthias Nießner (Stanford University). http://www.graphics.stanford.edu/~niessner/thies2016face.html https://doi.org/10.1109/CVPR.2016.262 Neural Face Editing with Intrinsic Image Disentangling. Zhixin Shu, Ersin Yumer, Sunil Hadap, Kalyan Sunkavalli, Eli Shechtman, Dimitris Samaras (Submitted on 13 Apr 2017) https://arxiv.org/abs/1704.04131 University of Washington researchers have developed new algorithms that solve a thorny challenge in the field of computer vision: turning audio clips into a realistic, lip-synced video of the person speaking those words. As detailed in a paper to be presented Aug. 2 at SIGGRAPH 2017, the team successfully generated highly realistic video of former president Barack Obama talking about terrorism, fatherhood, job creation and other topics using audio clips of those speeches and existing weekly video addresses that were originally on a different topic. Synthesizing Obama: learning lip sync from audio. Supasorn Suwajanakorn, Steven M. Seitz, Ira Kemelmacher-Shlizerman. ACM Transactions on Graphics (TOG), Volume 36, Issue 4, July 2017, https://doi.org/10.1145/3072959.3073640 http://www.washington.edu/news/2017/07/11/lip-syncing-obama-new-tools-turn-audio-clips-into-realistic-video/
