177 research outputs found

    Integration of Absolute Orientation Measurements in the KinectFusion Reconstruction pipeline

    Full text link
    In this paper, we show how absolute orientation measurements provided by low-cost but high-fidelity IMU sensors can be integrated into the KinectFusion pipeline. We show that integration improves both runtime, robustness and quality of the 3D reconstruction. In particular, we use this orientation data to seed and regularize the ICP registration technique. We also present a technique to filter the pairs of 3D matched points based on the distribution of their distances. This filter is implemented efficiently on the GPU. Estimating the distribution of the distances helps control the number of iterations necessary for the convergence of the ICP algorithm. Finally, we show experimental results that highlight improvements in robustness, a speed-up of almost 12%, and a gain in tracking quality of 53% for the ATE metric on the Freiburg benchmark.Comment: CVPR Workshop on Visual Odometry and Computer Vision Applications Based on Location Clues 201

    Large Scale 3D Mapping of Indoor Environments Using a Handheld RGBD Camera

    Get PDF
    The goal of this research is to investigate the problem of reconstructing a 3D representation of an environment, of arbitrary size, using a handheld color and depth (RGBD) sensor. The focus of this dissertation is to examine four of the underlying subproblems to this system: camera tracking, loop closure, data storage, and integration. First, a system for 3D reconstruction of large indoor planar environments with data captured from an RGBD sensor mounted on a mobile robotic platform is presented. An algorithm for constructing nearly drift-free 3D occupancy grids of large indoor environments in an online manner is also presented. This approach combines data from an odometry sensor with output from a visual registration algorithm, and it enforces a Manhattan world constraint by utilizing factor graphs to produce an accurate online estimate of the trajectory of the mobile robotic platform. Through several experiments in environments with varying sizes and construction it is shown that this method reduces rotational and translational drift significantly without performing any loop closing techniques. In addition the advantages and limitations of an octree data structure representation of a 3D environment is examined. Second, the problem of sensor tracking, specifically the use of the KinectFusion algorithm to align two subsequent point clouds generated by an RGBD sensor, is studied. A method to overcome a significant limitation of the Iterative Closest Point (ICP) algorithm used in KinectFusion is proposed, namely, its sole reliance upon geometric information. The proposed method uses both geometric and color information in a direct manner that uses all the data in order to accurately estimate camera pose. Data association is performed by computing a warp between the two color images associated with two RGBD point clouds using the Lucas-Kanade algorithm. A subsequent step then estimates the transformation between the point clouds using either a point-to-point or point-to-plane error metric. Scenarios in which each of these metrics fails are described, and a normal covariance test for automatically selecting between them is proposed. Together, Lucas-Kanade data association (LKDA) along with covariance testing enables robust camera tracking through areas of low geometrical features, while at the same time retaining accuracy in environments in which the existing ICP technique succeeds. Experimental results on several publicly available datasets demonstrate the improved performance both qualitatively and quantitatively. Third, the choice of state space in the context of performing loop closure is revisited. Although a relative state space has been discounted by previous authors, it is shown that such a state space is actually extremely powerful, able to achieve recognizable results after just one iteration. The power behind the technique is that changing the orientation of one node is able to affect other nodes. At the same time, the approach --- which is referred to as Pose Optimization using a Relative State Space (POReSS) --- is fast because, like the more popular incremental state space, the Jacobian never needs to be explicitly computed. Furthermore, it is shown that while POReSS is able to quickly compute a solution near the global optimum, it is not precise enough to perform the fine adjustments necessary to achieve acceptable results. As a result, a method to augment POReSS with a fast variant of Gauss-Seidel --- which is referred to as Graph-Seidel --- on a global state space to allow the solution to settle closer to the global minimum is proposed. Through a set of experiments, it is shown that this combination of POReSS and Graph-Seidel is not only faster but achieves a lower residual than other non-linear algebra techniques. Moreover, unlike the linear algebra-based techniques, it is shown that this approach scales to very large graphs. In addition to revisiting the idea of using a relative state space, the benefits of only optimizing the rotational components of a trajectory in order to perform loop closing is examined (rPOReSS). Finally, an incremental implementation of the rotational optimization is proposed (irPOReSS)

    Real-time High Resolution Fusion of Depth Maps on GPU

    Full text link
    A system for live high quality surface reconstruction using a single moving depth camera on a commodity hardware is presented. High accuracy and real-time frame rate is achieved by utilizing graphics hardware computing capabilities via OpenCL and by using sparse data structure for volumetric surface representation. Depth sensor pose is estimated by combining serial texture registration algorithm with iterative closest points algorithm (ICP) aligning obtained depth map to the estimated scene model. Aligned surface is then fused into the scene. Kalman filter is used to improve fusion quality. Truncated signed distance function (TSDF) stored as block-based sparse buffer is used to represent surface. Use of sparse data structure greatly increases accuracy of scanned surfaces and maximum scanning area. Traditional GPU implementation of volumetric rendering and fusion algorithms were modified to exploit sparsity to achieve desired performance. Incorporation of texture registration for sensor pose estimation and Kalman filter for measurement integration improved accuracy and robustness of scanning process

    A comparative study of breast surface reconstruction for aesthetic outcome assessment

    Get PDF
    Breast cancer is the most prevalent cancer type in women, and while its survival rate is generally high the aesthetic outcome is an increasingly important factor when evaluating different treatment alternatives. 3D scanning and reconstruction techniques offer a flexible tool for building detailed and accurate 3D breast models that can be used both pre-operatively for surgical planning and post-operatively for aesthetic evaluation. This paper aims at comparing the accuracy of low-cost 3D scanning technologies with the significantly more expensive state-of-the-art 3D commercial scanners in the context of breast 3D reconstruction. We present results from 28 synthetic and clinical RGBD sequences, including 12 unique patients and an anthropomorphic phantom demonstrating the applicability of low-cost RGBD sensors to real clinical cases. Body deformation and homogeneous skin texture pose challenges to the studied reconstruction systems. Although these should be addressed appropriately if higher model quality is warranted, we observe that low-cost sensors are able to obtain valuable reconstructions comparable to the state-of-the-art within an error margin of 3 mm.Comment: This paper has been accepted to MICCAI201

    A survey on real-time 3D scene reconstruction with SLAM methods in embedded systems

    Full text link
    The 3D reconstruction of simultaneous localization and mapping (SLAM) is an important topic in the field for transport systems such as drones, service robots and mobile AR/VR devices. Compared to a point cloud representation, the 3D reconstruction based on meshes and voxels is particularly useful for high-level functions, like obstacle avoidance or interaction with the physical environment. This article reviews the implementation of a visual-based 3D scene reconstruction pipeline on resource-constrained hardware platforms. Real-time performances, memory management and low power consumption are critical for embedded systems. A conventional SLAM pipeline from sensors to 3D reconstruction is described, including the potential use of deep learning. The implementation of advanced functions with limited resources is detailed. Recent systems propose the embedded implementation of 3D reconstruction methods with different granularities. The trade-off between required accuracy and resource consumption for real-time localization and reconstruction is one of the open research questions identified and discussed in this paper

    Scene Reconstruction And Understanding By RGB-D Sensors

    Get PDF
    This thesis investigates interactive scene reconstruction and understanding using RGB-D data only. Indeed, we believe that depth cameras will still be in the near future a cheap and low-power 3D sensing alternative suitable for mobile devices too. Therefore, our contributions build on top of state-of-the-art approaches to achieve advances in three main challenging scenarios, namely mobile mapping, large scale surface reconstruction and semantic modeling. First, we will describe an effective approach dealing with Simultaneous Localization And Mapping (SLAM) on platforms with limited resources, such as a tablet device. Unlike previous methods, dense reconstruction is achieved by reprojection of RGB-D frames, while local consistency is maintained by deploying relative bundle adjustment principles. We will show quantitative results comparing our technique to the state-of-the-art as well as detailed reconstruction of various environments ranging from rooms to small apartments. Then, we will address large scale surface modeling from depth maps exploiting parallel GPU computing. We will develop a real-time camera tracking method based on the popular KinectFusion system and an online surface alignment technique capable of counteracting drift errors and closing small loops. We will show very high quality meshes outperforming existing methods on publicly available datasets as well as on data recorded with our RGB-D camera even in complete darkness. Finally, we will move to our Semantic Bundle Adjustment framework to effectively combine object detection and SLAM in a unified system. Though the mathematical framework we will describe does not restrict to a particular sensing technology, in the experimental section we will refer, again, only to RGB-D sensing. We will discuss successful implementations of our algorithm showing the benefit of a joint object detection, camera tracking and environment mapping

    Semantic Slam: A New Paradigm for Object Recognition and Scene Reconstruction

    Get PDF
    Simultaneous localisation and mapping (SLAM) is a technique studied in computer vision and robotics that, given measurements obtained from one or more sensors, allows incremental building of a map of the environment and simultaneous estimation of the position and orientation of the very same sensor used to acquire the input data. Visual SLAM systems typically allow the generation of accurate reconstructions of the explored environment but, until very recently, did not provide high level informations on the contents of the reconstructed scenes, useful to foster high level reasoning by subsequent algorithms. In this thesis we focus on the topic of Semantic SLAM, proposing techniques to obtain semantically accurate reconstructions of the explored environment by combining efficient SLAM systems with state-of-the-art semantic image segmentation algorithms. We show how, by relying on such semantic reconstructions, the accuracy of the localisation phase of a SLAM pipeline can improve, by accounting for the presence of semantic informations during the camera pose estimation step. We thus realise a "semantic loop", where the availability of high level clues betters the mapping process, in turn helping the subsequent localisation phase. A full system, drawing inspiration from the presented research, allowing a real-time and automatic semantic mapping of large-scale environments is then presented. An ancillary, but nevertheless important, component of simultaneous localisation and mapping systems is a technique to allow the estimation of sensor position separately from the main SLAM loop, to recover from failures in the localisation algorithm. We present a technique that, by exploiting the appearance of image patches, can reliably localise the likely position of the sensor used to acquire such images. Such relocalisation system can be easily included in a Semantic SLAM system to allow a more robust mapping process wherein camera tracking failures can be reliably recovered from

    Multi-resolution mapping and planning for UAV navigation in attitude constrained environments

    Get PDF
    In this thesis we aim to bridge the gap between high quality map reconstruction and Unmanned Aerial Vehicles (UAVs) SE(3) motion planning in challenging environments with narrow openings, such as disaster areas, which requires attitude to be considered. We propose an efficient system that leverages the concept of adaptive-resolution volumetric mapping, which naturally integrates with the hierarchical decomposition of space in an octree data structure. Instead of a Truncated Signed Distance Function (TSDF), we adopt mapping of occupancy probabilities in log-odds representation, which allows representation of both surfaces, as well as the entire free, i.e.\ observed space, as opposed to unobserved space. We introduce a method for choosing resolution -on the fly- in real-time by means of a multi-scale max-min pooling of the input depth image. The notion of explicit free space mapping paired with the spatial hierarchy in the data structure, as well as map resolution, allows for collision queries, as needed for robot motion planning, at unprecedented speed. Our mapping strategy supports pinhole cameras as well as spherical sensor models. Additionally, we introduce a first-of-a-kind global minimum cost path search method based on A* that considers attitude along the path. State-of-the-art methods incorporate attitude only in the refinement stage. To make the problem tractable, our method exploits an adaptive and coarse-to-fine approach using global and local A* runs, plus an efficient method to introduce the UAV attitude in the process. We integrate our method with an SE(3) trajectory optimisation method based on a safe-flight-corridor, yielding a complete path planning pipeline. We quantitatively evaluate our mapping strategy in terms of mapping accuracy, memory, runtime performance, and planning performance showing improvements over the state-of-the-art, particularly in cases requiring high resolution maps. Furthermore, extensive evaluation is undertaken using the AirSim flight simulator under closed loop control in a set of randomised maps, allowing us to quantitatively assess our path initialisation method. We show that it achieves significantly higher success rates than the baselines, at a reduced computational burden.Open Acces
    corecore