2,044 research outputs found

    Fine-To-Coarse Global Registration of RGB-D Scans

    RGB-D scanning of indoor environments is important for many applications, including real estate, interior design, and virtual reality. However, it is still challenging to register RGB-D images from a hand-held camera over a long video sequence into a globally consistent 3D model. Current methods often lose tracking or drift and thus fail to reconstruct salient structures in large environments (e.g., parallel walls in different rooms). To address this problem, we propose a "fine-to-coarse" global registration algorithm that leverages robust registrations at finer scales to seed detection and enforcement of new correspondence and structural constraints at coarser scales. To test global registration algorithms, we provide a benchmark with 10,401 manually-clicked point correspondences in 25 scenes from the SUN3D dataset. During experiments with this benchmark, we find that our fine-to-coarse algorithm registers long RGB-D sequences better than previous methods.

    Global alignment of deformable objects captured by a single RGB-D camera

    We present a novel global registration method for deformable objects captured using a single RGB-D camera. Our algorithm allows objects to undergo large non-rigid deformations, and achieves high quality results without constraining the actor's pose or camera motion. We compute the deformations of all the scans simultaneously by optimizing a global alignment problem to avoid the well-known loop closure problem, and use an as-rigid-as-possible constraint to eliminate the shrinkage problem of the deformed model. To attack large-scale problems, we design a coarse-to-fine multi-resolution scheme, which also keeps the optimization from being trapped in local minima. The proposed method is evaluated on public datasets and real datasets captured by an RGB-D sensor. Experimental results demonstrate that the proposed method obtains better results than the state-of-the-art methods.

    Global 3D non-rigid registration of deformable objects using a single RGB-D camera

    We present a novel global non-rigid registration method for dynamic 3D objects. Our method allows objects to undergo large non-rigid deformations, and achieves high quality results even with substantial pose change or camera motion between views. In addition, our method does not require a template prior and uses less raw data than tracking-based methods since only a sparse set of scans is needed. We compute the deformations of all the scans simultaneously by optimizing a global alignment problem to avoid the well-known loop closure problem, and use an as-rigid-as-possible constraint to eliminate the shrinkage problem of the deformed shapes, especially near open boundaries of scans. To cope with large-scale problems, we design a coarse-to-fine multi-resolution scheme, which also keeps the optimization from being trapped in local minima. The proposed method is evaluated on public datasets and real datasets captured by an RGB-D sensor. Experimental results demonstrate that the proposed method obtains better results than several state-of-the-art methods.

    ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans

    We introduce ScanComplete, a novel data-driven approach for taking an incomplete 3D scan of a scene as input and predicting a complete 3D model along with per-voxel semantic labels. The key contribution of our method is its ability to handle large scenes with varying spatial extent, managing the cubic growth in data size as scene size increases. To this end, we devise a fully-convolutional generative 3D CNN model whose filter kernels are invariant to the overall scene size. The model can be trained on scene subvolumes but deployed on arbitrarily large scenes at test time. In addition, we propose a coarse-to-fine inference strategy in order to produce high-resolution output while also leveraging large input context sizes. In an extensive series of experiments, we carefully evaluate different model design choices, considering both deterministic and probabilistic models for completion and semantic inference. Our results show that we outperform other methods not only in the size of the environments handled and processing efficiency, but also with regard to completion quality and semantic segmentation performance by a significant margin. Comment: Video: https://youtu.be/5s5s8iH0NF

    Automatic Multiview Alignment of RGB-D Range Maps of Upper Limb Anatomy

    Digital representations of anatomical parts are crucial for various biomedical applications. This paper presents an automatic alignment procedure for creating accurate 3D models of upper limb anatomy using a low-cost handheld 3D scanner. The goal is to overcome the challenges associated with forearm 3D scanning, such as needing multiple views, stability requirements, and optical undercuts. While bulky and expensive multi-camera systems have been used in previous research, this study explores the feasibility of using multiple consumer RGB-D sensors for scanning human anatomies. The proposed scanner comprises three Intel® RealSense™ D415 depth cameras assembled on a lightweight circular jig, enabling simultaneous acquisition from three viewpoints. To achieve automatic alignment, the paper introduces a procedure that extracts common key points between acquisitions deriving from different scanner poses. Relevant hand key points are detected using a neural network, which works on the RGB images captured by the depth cameras. A set of forearm key points is meanwhile identified by processing the acquired data through a specifically developed algorithm that seeks the forearm’s skeleton line. The alignment process involves automatic, rough 3D alignment and fine registration using an iterative-closest-point (ICP) algorithm expressly developed for this application. The proposed method was tested on forearm scans, and its results were compared with those obtained by a manual coarse alignment followed by an ICP algorithm for fine registration using commercial software. Deviations below 5 mm, with a mean value of 1.5 mm, were found. The obtained results are critically discussed and compared with the available implementations of published methods. The results demonstrate significant improvements to the state of the art and the potential of the proposed approach to accelerate the acquisition process and automatically register point clouds from different scanner poses without the intervention of skilled operators. This study contributes to developing effective upper limb rehabilitation frameworks and personalized biomedical applications by addressing these critical challenges.
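
    The coarse-then-fine alignment described in the abstract above can be illustrated with a minimal sketch. This is textbook point-to-point ICP on 2D points, not the application-specific ICP variant the authors developed; it assumes a rough alignment has already brought the two point sets close together. The function names (best_rigid_2d, apply_rigid, icp_2d) and the fixed iteration count are illustrative assumptions.

```python
import math

def best_rigid_2d(src, dst):
    """Least-squares rotation and translation mapping paired 2D points src -> dst."""
    n = len(src)
    cxs = sum(p[0] for p in src) / n
    cys = sum(p[1] for p in src) / n
    cxd = sum(p[0] for p in dst) / n
    cyd = sum(p[1] for p in dst) / n
    a = b = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        xs, ys, xd, yd = xs - cxs, ys - cys, xd - cxd, yd - cyd
        a += xs * xd + ys * yd  # coefficient of cos(theta) in the alignment score
        b += xs * yd - ys * xd  # coefficient of sin(theta) in the alignment score
    theta = math.atan2(b, a)    # angle maximizing a*cos(theta) + b*sin(theta)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    tx = cxd - (c * cxs - s * cys)
    ty = cyd - (s * cxs + c * cys)
    return theta, tx, ty

def apply_rigid(theta, tx, ty, pts):
    """Rotate pts by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]

def icp_2d(src, dst, iters=20):
    """Fine registration: alternate nearest-neighbour matching and rigid fitting."""
    cur = list(src)
    for _ in range(iters):
        # Brute-force nearest neighbour in dst for every current source point.
        matched = [min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in cur]
        theta, tx, ty = best_rigid_2d(cur, matched)
        cur = apply_rigid(theta, tx, ty, cur)
    return cur
```

    Calling icp_2d(src, dst) returns the source points after alignment. Because matching is by nearest neighbour, the result depends on the quality of the initial rough alignment, which is why a coarse step (key-point based in the abstract) precedes the fine registration.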

    Plane-Based Optimization of Geometry and Texture for RGB-D Reconstruction of Indoor Scenes

    We present a novel approach to reconstruct RGB-D indoor scenes with plane primitives. Our approach takes as input an RGB-D sequence and a dense coarse mesh reconstructed by some 3D reconstruction method on the sequence, and generates a lightweight, low-polygonal mesh with clear face textures and sharp features without losing geometry details from the original scene. To achieve this, we first partition the input mesh with plane primitives, then simplify it into a lightweight mesh, then optimize plane parameters, camera poses and texture colors to maximize the photometric consistency across frames, and finally optimize mesh geometry to maximize consistency between geometry and planes. Compared to existing planar reconstruction methods, which only cover large planar regions in the scene, our method builds the entire scene from adaptive planes without losing geometry details and preserves sharp features in the final mesh. We demonstrate the effectiveness of our approach by applying it to several RGB-D scans and comparing it to other state-of-the-art reconstruction methods. Comment: in International Conference on 3D Vision 2018; Models and Code: see https://github.com/chaowang15/plane-opt-rgbd. arXiv admin note: text overlap with arXiv:1905.0885

    3D Face Reconstruction from Light Field Images: A Model-free Approach

    Reconstructing 3D facial geometry from a single RGB image has recently instigated wide research interest. However, it is still an ill-posed problem and most methods rely on prior models, hence undermining the accuracy of the recovered 3D faces. In this paper, we exploit the Epipolar Plane Images (EPI) obtained from light field cameras and learn CNN models that recover horizontal and vertical 3D facial curves from the respective horizontal and vertical EPIs. Our 3D face reconstruction network (FaceLFnet) comprises a densely connected architecture to learn accurate 3D facial curves from low resolution EPIs. To train the proposed FaceLFnets from scratch, we synthesize photo-realistic light field images from 3D facial scans. The curve-by-curve 3D face estimation approach allows the networks to learn from only 14K images of 80 identities, which still comprise over 11 million EPIs/curves. The estimated facial curves are merged into a single point cloud to which a surface is fitted to get the final 3D face. Our method is model-free, requires only a few training samples to learn FaceLFnet and can reconstruct 3D faces with high accuracy from single light field images under varying poses, expressions and lighting conditions. Comparisons on the BU-3DFE and BU-4DFE datasets show that our method reduces reconstruction errors by over 20% compared to the recent state of the art.

    Planar Odometry from a Radial Laser Scanner. A Range Flow-based Approach

    In this paper we present a fast and precise method to estimate the planar motion of a lidar from consecutive range scans. For every scanned point we formulate the range flow constraint equation in terms of the sensor velocity, and minimize a robust function of the resulting geometric constraints to obtain the motion estimate. In contrast to traditional approaches, this method does not search for correspondences but performs dense scan alignment based on the scan gradients, in the fashion of dense 3D visual odometry. The minimization problem is solved in a coarse-to-fine scheme to cope with large displacements, and a smooth filter based on the covariance of the estimate is employed to handle uncertainty in unconstrained scenarios (e.g. corridors). Simulated and real experiments have been performed to compare our approach with two prominent scan matchers and with wheel odometry. Quantitative and qualitative results demonstrate the superior performance of our approach which, along with its very low computational cost (0.9 milliseconds on a single CPU core), makes it suitable for those robotic applications that require planar odometry. For this purpose, we also provide the code so that the robotics community can benefit from it. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. Spanish Government under project DPI2014-55826-R and the grant program FPI-MICINN 2012.