5,935 research outputs found

    Building with Drones: Accurate 3D Facade Reconstruction using MAVs

    Full text link
    Automatic reconstruction of 3D models from images using multi-view Structure-from-Motion methods has been one of the most fruitful outcomes of computer vision. These advances combined with the growing popularity of Micro Aerial Vehicles as an autonomous imaging platform, have made 3D vision tools ubiquitous for large number of Architecture, Engineering and Construction applications among audiences, mostly unskilled in computer vision. However, to obtain high-resolution and accurate reconstructions from a large-scale object using SfM, there are many critical constraints on the quality of image data, which often become sources of inaccuracy as the current 3D reconstruction pipelines do not facilitate the users to determine the fidelity of input data during the image acquisition. In this paper, we present and advocate a closed-loop interactive approach that performs incremental reconstruction in real-time and gives users an online feedback about the quality parameters like Ground Sampling Distance (GSD), image redundancy, etc on a surface mesh. We also propose a novel multi-scale camera network design to prevent scene drift caused by incremental map building, and release the first multi-scale image sequence dataset as a benchmark. Further, we evaluate our system on real outdoor scenes, and show that our interactive pipeline combined with a multi-scale camera network approach provides compelling accuracy in multi-view reconstruction tasks when compared against the state-of-the-art methods.Comment: 8 Pages, 2015 IEEE International Conference on Robotics and Automation (ICRA '15), Seattle, WA, US

    Evaluating Point Cloud Quality via Transformational Complexity

    Full text link
    Full-reference point cloud quality assessment (FR-PCQA) aims to infer the quality of distorted point clouds with available references. Merging the research of cognitive science and intuition of the human visual system (HVS), the difference between the expected perceptual result and the practical perception reproduction in the visual center of the cerebral cortex indicates the subjective quality degradation. Therefore in this paper, we try to derive the point cloud quality by measuring the complexity of transforming the distorted point cloud back to its reference, which in practice can be approximated by the code length of one point cloud when the other is given. For this purpose, we first segment the reference and the distorted point cloud into a series of local patch pairs based on one 3D Voronoi diagram. Next, motivated by the predictive coding theory, we utilize one space-aware vector autoregressive (SA-VAR) model to encode the geometry and color channels of each reference patch in cases with and without the distorted patch, respectively. Specifically, supposing that the residual errors follow the multi-variate Gaussian distributions, we calculate the self-complexity of the reference and the transformational complexity between the reference and the distorted sample via covariance matrices. Besides the complexity terms, the prediction terms generated by SA-VAR are introduced as one auxiliary feature to promote the final quality prediction. Extensive experiments on five public point cloud quality databases demonstrate that the transformational complexity based distortion metric (TCDM) produces state-of-the-art (SOTA) results, and ablation studies have further shown that our metric can be generalized to various scenarios with consistent performance by examining its key modules and parameters

    Point cloud geometry compression using neural implicit representations

    Get PDF
    openIn recent years, the increasing prominence of 3D point clouds in various applications has led to an escalating need for efficient storage and transmission methods. The sheer size of these point cloud datasets presents challenges in rendering, transmission, and general usability. This thesis introduces a novel approach to point cloud geometry compression leveraging neural implicit representations, specifically through the use of a DiGS network model. By training this model on a single point cloud, we achieve a compact neural representation of its geometry. Notably, this representation allows for the reconstruction of the point cloud with an arbitrary resolution. After training a reconstructing network, dynamic quantization is applied on the trained weights, significantly reducing its overall bitrate without strongly compromising the quality of the reconstructed point cloud. A dequantization is then used to rebuild a high-fidelity representation of the original point cloud. Our experimental results demonstrate the efficacy of this approach in terms of compression ratios and reconstruction quality, assessed using PSNR relative to the bitrate. This research provides a promising direction for efficient point cloud geometry storage and transmission, addressing some of the growing demands of the 3D data era

    Micro Fourier Transform Profilometry (μ\muFTP): 3D shape measurement at 10,000 frames per second

    Full text link
    Recent advances in imaging sensors and digital light projection technology have facilitated a rapid progress in 3D optical sensing, enabling 3D surfaces of complex-shaped objects to be captured with improved resolution and accuracy. However, due to the large number of projection patterns required for phase recovery and disambiguation, the maximum fame rates of current 3D shape measurement techniques are still limited to the range of hundreds of frames per second (fps). Here, we demonstrate a new 3D dynamic imaging technique, Micro Fourier Transform Profilometry (μ\muFTP), which can capture 3D surfaces of transient events at up to 10,000 fps based on our newly developed high-speed fringe projection system. Compared with existing techniques, μ\muFTP has the prominent advantage of recovering an accurate, unambiguous, and dense 3D point cloud with only two projected patterns. Furthermore, the phase information is encoded within a single high-frequency fringe image, thereby allowing motion-artifact-free reconstruction of transient events with temporal resolution of 50 microseconds. To show μ\muFTP's broad utility, we use it to reconstruct 3D videos of 4 transient scenes: vibrating cantilevers, rotating fan blades, bullet fired from a toy gun, and balloon's explosion triggered by a flying dart, which were previously difficult or even unable to be captured with conventional approaches.Comment: This manuscript was originally submitted on 30th January 1

    Real Time Structured Light and Applications

    Get PDF

    Increasing the Efficiency of 6-DoF Visual Localization Using Multi-Modal Sensory Data

    Full text link
    Localization is a key requirement for mobile robot autonomy and human-robot interaction. Vision-based localization is accurate and flexible, however, it incurs a high computational burden which limits its application on many resource-constrained platforms. In this paper, we address the problem of performing real-time localization in large-scale 3D point cloud maps of ever-growing size. While most systems using multi-modal information reduce localization time by employing side-channel information in a coarse manner (eg. WiFi for a rough prior position estimate), we propose to inter-weave the map with rich sensory data. This multi-modal approach achieves two key goals simultaneously. First, it enables us to harness additional sensory data to localise against a map covering a vast area in real-time; and secondly, it also allows us to roughly localise devices which are not equipped with a camera. The key to our approach is a localization policy based on a sequential Monte Carlo estimator. The localiser uses this policy to attempt point-matching only in nodes where it is likely to succeed, significantly increasing the efficiency of the localization process. The proposed multi-modal localization system is evaluated extensively in a large museum building. The results show that our multi-modal approach not only increases the localization accuracy but significantly reduces computational time.Comment: Presented at IEEE-RAS International Conference on Humanoid Robots (Humanoids) 201
    corecore