585 research outputs found

    InLoc: Indoor Visual Localization with Dense Matching and View Synthesis

    Get PDF
    We seek to predict the 6 degree-of-freedom (6DoF) pose of a query photograph with respect to a large indoor 3D map. The contributions of this work are three-fold. First, we develop a new large-scale visual localization method targeted for indoor environments. The method proceeds along three steps: (i) efficient retrieval of candidate poses that ensures scalability to large-scale environments, (ii) pose estimation using dense matching rather than local features to deal with textureless indoor scenes, and (iii) pose verification by virtual view synthesis to cope with significant changes in viewpoint, scene layout, and occluders. Second, we collect a new dataset with reference 6DoF poses for large-scale indoor localization. Query photographs are captured by mobile phones at a different time than the reference 3D map, thus presenting a realistic indoor localization scenario. Third, we demonstrate that our method significantly outperforms current state-of-the-art indoor localization approaches on this new challenging data

    ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems

    Full text link
    In this paper we present ActiveStereoNet, the first deep learning solution for active stereo systems. Due to the lack of ground truth, our method is fully self-supervised, yet it produces precise depth with a subpixel precision of 1/30th1/30th of a pixel; it does not suffer from the common over-smoothing issues; it preserves the edges; and it explicitly handles occlusions. We introduce a novel reconstruction loss that is more robust to noise and texture-less patches, and is invariant to illumination changes. The proposed loss is optimized using a window-based cost aggregation with an adaptive support weight scheme. This cost aggregation is edge-preserving and smooths the loss function, which is key to allow the network to reach compelling results. Finally we show how the task of predicting invalid regions, such as occlusions, can be trained end-to-end without ground-truth. This component is crucial to reduce blur and particularly improves predictions along depth discontinuities. Extensive quantitatively and qualitatively evaluations on real and synthetic data demonstrate state of the art results in many challenging scenes.Comment: Accepted by ECCV2018, Oral Presentation, Main paper + Supplementary Material

    Multi-View Neural Surface Reconstruction with Structured Light

    Full text link
    Three-dimensional (3D) object reconstruction based on differentiable rendering (DR) is an active research topic in computer vision. DR-based methods minimize the difference between the rendered and target images by optimizing both the shape and appearance and realizing a high visual reproductivity. However, most approaches perform poorly for textureless objects because of the geometrical ambiguity, which means that multiple shapes can have the same rendered result in such objects. To overcome this problem, we introduce active sensing with structured light (SL) into multi-view 3D object reconstruction based on DR to learn the unknown geometry and appearance of arbitrary scenes and camera poses. More specifically, our framework leverages the correspondences between pixels in different views calculated by structured light as an additional constraint in the DR-based optimization of implicit surface, color representations, and camera poses. Because camera poses can be optimized simultaneously, our method realizes high reconstruction accuracy in the textureless region and reduces efforts for camera pose calibration, which is required for conventional SL-based methods. Experiment results on both synthetic and real data demonstrate that our system outperforms conventional DR- and SL-based methods in a high-quality surface reconstruction, particularly for challenging objects with textureless or shiny surfaces.Comment: Accepted by BMVC 202

    Do-It-Yourself Single Camera 3D Pointer Input Device

    Full text link
    We present a new algorithm for single camera 3D reconstruction, or 3D input for human-computer interfaces, based on precise tracking of an elongated object, such as a pen, having a pattern of colored bands. To configure the system, the user provides no more than one labelled image of a handmade pointer, measurements of its colored bands, and the camera's pinhole projection matrix. Other systems are of much higher cost and complexity, requiring combinations of multiple cameras, stereocameras, and pointers with sensors and lights. Instead of relying on information from multiple devices, we examine our single view more closely, integrating geometric and appearance constraints to robustly track the pointer in the presence of occlusion and distractor objects. By probing objects of known geometry with the pointer, we demonstrate acceptable accuracy of 3D localization.Comment: 8 pages, 6 figures, 2018 15th Conference on Computer and Robot Visio

    INVESTIGATING 3D RECONSTRUCTION OF NON-COLLABORATIVE SURFACES THROUGH PHOTOGRAMMETRY AND PHOTOMETRIC STEREO

    Get PDF
    Abstract. 3D digital reconstruction techniques are extensively used for quality control purposes. Among them, photogrammetry and photometric stereo methods have been for a long time used with success in several application fields. However, generating highly-detailed and reliable micro-measurements of non-collaborative surfaces is still an open issue. In these cases, photogrammetry can provide accurate low-frequency 3D information, whereas it struggles to extract reliable high-frequency details. Conversely, photometric stereo can recover a very detailed surface topography, although global surface deformation is often present. In this paper, we present the preliminary results of an ongoing project aiming to combine photogrammetry and photometric stereo in a synergetic fusion of the two techniques. Particularly, hereafter, we introduce the main concept design behind an image acquisition system we developed to capture images from different positions and under different lighting conditions as required by photogrammetry and photometric stereo techniques. We show the benefit of such a combination through some experimental tests. The experiments showed that the proposed method recovers the surface topography at the same high-resolution achievable with photometric stereo while preserving the photogrammetric accuracy. Furthermore, we exploit light directionality and multiple light sources to improve the quality of dense image matching in poorly textured surfaces

    RGBDTAM: A Cost-Effective and Accurate RGB-D Tracking and Mapping System

    Full text link
    Simultaneous Localization and Mapping using RGB-D cameras has been a fertile research topic in the latest decade, due to the suitability of such sensors for indoor robotics. In this paper we propose a direct RGB-D SLAM algorithm with state-of-the-art accuracy and robustness at a los cost. Our experiments in the RGB-D TUM dataset [34] effectively show a better accuracy and robustness in CPU real time than direct RGB-D SLAM systems that make use of the GPU. The key ingredients of our approach are mainly two. Firstly, the combination of a semi-dense photometric and dense geometric error for the pose tracking (see Figure 1), which we demonstrate to be the most accurate alternative. And secondly, a model of the multi-view constraints and their errors in the mapping and tracking threads, which adds extra information over other approaches. We release the open-source implementation of our approach 1 . The reader is referred to a video with our results 2 for a more illustrative visualization of its performance
    corecore