
    Efficient Online Surface Correction for Real-time Large-Scale 3D Reconstruction

    State-of-the-art methods for large-scale 3D reconstruction from RGB-D sensors usually reduce drift in camera tracking by globally optimizing the estimated camera poses in real-time, without simultaneously updating the reconstructed surface on pose changes. We propose an efficient on-the-fly surface correction method for globally consistent dense 3D reconstruction of large-scale scenes. Our approach uses a dense visual RGB-D SLAM system that estimates the camera motion in real-time on a CPU and refines it in a global pose graph optimization. Consecutive RGB-D frames are locally fused into keyframes, which are incorporated into a sparse voxel hashed Signed Distance Field (SDF) on the GPU. On pose graph updates, the SDF volume is corrected on-the-fly using a novel keyframe re-integration strategy with reduced GPU-host streaming. We demonstrate in an extensive quantitative evaluation that our method is up to 93% more runtime-efficient than the state of the art and requires significantly less memory, with only negligible loss of surface quality. Overall, our system requires only a single GPU and allows for real-time surface correction of large environments.
    Comment: British Machine Vision Conference (BMVC), London, September 2017
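
    The key mechanism behind this on-the-fly correction is de-integration: because SDF fusion is a weighted running average, a keyframe fused under a stale pose can be subtracted out of the volume and fused again under its corrected pose. The sketch below illustrates that idea in plain Python; the paper's GPU voxel-hashing backend and reduced host streaming are not reproduced, and all names (Keyframe, SdfVolume, VOXEL, TRUNC) are illustrative assumptions.

```python
# Minimal sketch of keyframe re-integration into a sparse SDF volume.
# Assumed structures and parameters, not the authors' implementation.
import numpy as np
from dataclasses import dataclass

VOXEL = 0.05   # voxel size in meters (assumed)
TRUNC = 0.15   # truncation band of the signed distance field (assumed)

@dataclass
class Keyframe:
    id: int
    points: np.ndarray  # Nx3 back-projected depth samples, camera frame
    sdf: np.ndarray     # signed distance observed per sample
    pose: np.ndarray    # 4x4 camera-to-world pose used at fusion time

class SdfVolume:
    def __init__(self):
        self.vox = {}  # spatial hash: integer voxel index -> (sdf, weight)

    def _apply(self, points, sdf_samples, sign):
        # sign=+1 integrates a keyframe, sign=-1 de-integrates it
        keys = np.floor(points / VOXEL).astype(int)
        for key, d in zip(map(tuple, keys), sdf_samples):
            d = float(np.clip(d, -TRUNC, TRUNC))
            sdf, w = self.vox.get(key, (0.0, 0.0))
            w_new = w + sign
            if w_new <= 0:
                self.vox.pop(key, None)  # last observation removed
                continue
            # weighted running average of truncated signed distances
            self.vox[key] = ((sdf * w + sign * d) / w_new, w_new)

    def integrate(self, kf, pose):
        world = kf.points @ pose[:3, :3].T + pose[:3, 3]
        self._apply(world, kf.sdf, +1)

    def deintegrate(self, kf, pose):
        world = kf.points @ pose[:3, :3].T + pose[:3, 3]
        self._apply(world, kf.sdf, -1)

def on_pose_graph_update(volume, keyframes, new_poses):
    # Correct the surface on the fly: only keyframes whose pose
    # actually changed are removed and fused back in.
    for kf in keyframes:
        new_pose = new_poses[kf.id]
        if np.allclose(new_pose, kf.pose, atol=1e-6):
            continue
        volume.deintegrate(kf, kf.pose)  # undo fusion at the stale pose
        volume.integrate(kf, new_pose)   # re-fuse at the corrected pose
        kf.pose = new_pose
```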

    GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction

    Neural implicit representations have recently demonstrated compelling results on dense Simultaneous Localization And Mapping (SLAM) but suffer from the accumulation of errors in camera tracking and distortion in the reconstruction. To this end, we present GO-SLAM, a deep-learning-based dense visual SLAM framework globally optimizing poses and 3D reconstruction in real-time. Robust pose estimation is at its core, supported by efficient loop closing and online full bundle adjustment, which optimize each frame's pose using the learned global geometry of the complete history of input frames. Simultaneously, we update the implicit and continuous surface representation on-the-fly to ensure global consistency of the 3D reconstruction. Results on various synthetic and real-world datasets demonstrate that GO-SLAM outperforms state-of-the-art approaches in tracking robustness and reconstruction accuracy. Furthermore, GO-SLAM is versatile and can run with monocular, stereo, and RGB-D input.
    Comment: ICCV 2023. Code: https://github.com/youmi-zym/GO-SLAM - Project Page: https://youmi-zym.github.io/projects/GO-SLAM
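
    The global optimization this abstract refers to can be pictured with a toy pose graph: a loop-closure edge contradicts the drifted odometry chain, and a least-squares solve redistributes that error over every pose. The SE(2) sketch below only illustrates this principle; GO-SLAM's actual bundle adjustment is deep-learning-based, operates on the full frame history, and jointly updates the implicit surface. All names and the 2D setting are assumptions.

```python
# Toy SE(2) pose-graph optimization sketch: not GO-SLAM's learned BA.
import numpy as np
from scipy.optimize import least_squares

def relative(a, b):
    # pose of b expressed in the frame of a; poses are (x, y, theta)
    c, s = np.cos(a[2]), np.sin(a[2])
    dx, dy = b[0] - a[0], b[1] - a[1]
    dth = np.arctan2(np.sin(b[2] - a[2]), np.cos(b[2] - a[2]))
    return np.array([c * dx + s * dy, -s * dx + c * dy, dth])

def residuals(x, edges):
    poses = x.reshape(-1, 3)
    res = [poses[0]]  # gauge constraint: pin the first pose at the origin
    for i, j, meas in edges:
        res.append(relative(poses[i], poses[j]) - meas)
    return np.concatenate(res)

# Drifted odometry chain plus one loop-closure measurement: the chain
# claims the robot moved 3 m in a line, the loop closure disagrees.
edges = [(0, 1, np.array([1.0, 0.0, 0.0])),
         (1, 2, np.array([1.0, 0.0, 0.0])),
         (2, 3, np.array([1.0, 0.0, 0.0])),
         (3, 0, np.array([-3.2, 0.3, 0.0]))]  # loop closure

x0 = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]], float).ravel()
sol = least_squares(residuals, x0, args=(edges,))
print(sol.x.reshape(-1, 3))  # drift spread consistently over all poses
```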

    The neural network behind the eyes of a fly

    With its regular, almost crystal-like structure, the fly optic lobe represents a particularly beautiful piece of nervous system, which consequently has attracted the attention of many researchers over the years. While the anatomy of the various cell types has long been known from Golgi studies, their visual response properties could only recently be revealed thanks to the advent of cell-specific driver lines and genetically encoded indicators of neural activity. Furthermore, dense EM reconstruction of several columns of the fly optic lobe now provides information about the synaptic connections between the different cell types, and RNA sequencing sheds light on the transmitter systems and ionic conductances used for communication between them. Together with the molecular tools allowing for blocking and activating individual, genetically targeted cell types, the fly optic lobe may soon become one of the best-understood visual neuropils in neuroscience. In this review, we summarize what we have learned so far, and discuss the major difficulties that keep us from a complete understanding of visual processing in the fly optic lobe.

    Active Image-based Modeling with a Toy Drone

    Image-based modeling techniques can now generate photo-realistic 3D models from images. But it is up to users to provide high-quality images with good coverage and view overlap, which makes the data capturing process tedious and time consuming. We seek to automate data capturing for image-based modeling. The core of our system is an iterative linear method to solve the multi-view stereo (MVS) problem quickly and plan the Next-Best-View (NBV) effectively. Our fast MVS algorithm enables online model reconstruction and quality assessment to determine the NBVs on the fly. We test our system with a toy unmanned aerial vehicle (UAV) in simulated, indoor and outdoor experiments. Results show that our system improves the efficiency of data acquisition and ensures the completeness of the final model.
    Comment: To be published at the International Conference on Robotics and Automation (ICRA) 2018, Brisbane, Australia. Project Page: https://huangrui815.github.io/active-image-based-modeling/ The author's personal page: http://www.sfu.ca/~rha55
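
    The planning step this abstract describes can be reduced to a greedy scoring loop: take the current partial model with a per-point reconstruction confidence, and pick the candidate pose that sees the most poorly-reconstructed surface. A minimal sketch follows; the paper's iterative linear MVS solver and its actual quality metric are not reproduced, and every name here (visible, next_best_view, fov_cos) is an illustrative assumption.

```python
# Minimal greedy next-best-view (NBV) selection sketch.
import numpy as np

def visible(points, cam_pos, cam_dir, fov_cos=0.7, max_range=10.0):
    # Crude frustum test: within range and inside a cone around cam_dir.
    v = points - cam_pos
    dist = np.linalg.norm(v, axis=1)
    along = (v / np.maximum(dist[:, None], 1e-9)) @ cam_dir
    return (dist < max_range) & (along > fov_cos)

def next_best_view(points, confidence, candidates):
    # candidates: iterable of (position, unit viewing direction) pairs.
    # Gain of a view = total uncertainty of the points it would observe.
    best, best_gain = None, -1.0
    for pos, direction in candidates:
        mask = visible(points, pos, direction)
        gain = float(np.sum((1.0 - confidence)[mask]))
        if gain > best_gain:
            best, best_gain = (pos, direction), gain
    return best, best_gain
```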

    NICP: Dense normal based point cloud registration

    In this paper we present a novel online method to recursively align point clouds. By considering each point together with the local features of the surface (normal and curvature), our method takes advantage of the 3D structure around the points when determining the data association between two clouds. The algorithm relies on a least squares formulation of the alignment problem, which minimizes an error metric depending on these surface characteristics. We named the approach Normal Iterative Closest Point (NICP for short). Extensive experiments on publicly available benchmark data show that NICP outperforms other state-of-the-art approaches.
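
    The distinguishing idea is that both the correspondence search and the error being minimized use the local surface, not just point positions. A minimal flavor of this is sketched below: association runs in a 6D space of coordinates and scaled normals, and the alignment error penalizes normal disagreement alongside point distance. The paper's curvature term and full least-squares solver are omitted; the names and weights are assumptions.

```python
# Minimal sketch of normal-augmented association and error, NICP-style.
import numpy as np
from scipy.spatial import cKDTree

def associate(src_pts, src_nrm, dst_pts, dst_nrm, normal_scale=0.1):
    # 6D nearest neighbors: two nearby points on differently oriented
    # surfaces (e.g. across a thin wall) no longer match each other.
    src6 = np.hstack([src_pts, normal_scale * src_nrm])
    dst6 = np.hstack([dst_pts, normal_scale * dst_nrm])
    _, idx = cKDTree(dst6).query(src6)
    return idx

def alignment_error(src_pts, src_nrm, dst_pts, dst_nrm, idx, w_n=0.1):
    # Combined metric: point-to-point distance plus normal disagreement,
    # the quantity a least-squares alignment step would minimize.
    e_pts = np.sum((src_pts - dst_pts[idx]) ** 2, axis=1)
    e_nrm = np.sum((src_nrm - dst_nrm[idx]) ** 2, axis=1)
    return float(np.mean(e_pts + w_n * e_nrm))
```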

    Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos

    Learning to predict scene depth from RGB inputs is a challenging task both for indoor and outdoor robot navigation. In this work we address unsupervised learning of scene depth and robot ego-motion where supervision is provided by monocular videos, as cameras are the cheapest, least restrictive, and most ubiquitous sensor for robotics. Previous work in unsupervised image-to-depth learning has established strong baselines in the domain. We propose a novel approach which produces higher quality results, is able to model moving objects, and is shown to transfer across data domains, e.g. from outdoor to indoor scenes. The main idea is to introduce geometric structure into the learning process by modeling the scene and the individual objects; camera ego-motion and object motions are learned from monocular videos as input. Furthermore, an online refinement method is introduced to adapt learning on the fly to unknown domains. The proposed approach outperforms all state-of-the-art approaches, including those that handle motion, e.g. through learned flow. Our results are comparable in quality to the ones which used stereo as supervision, and significantly improve depth prediction on scenes and datasets which contain a lot of object motion. The approach is of practical relevance, as it allows transfer across environments, by transferring models trained on data collected for robot navigation in urban scenes to indoor navigation settings. The code associated with this paper can be found at https://sites.google.com/view/struct2depth.
    Comment: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI'19)
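
    The supervision signal in this family of methods is photometric: predicted depth and ego-motion are used to warp one video frame into another, and the networks are trained to make the warped image match the observed one. A minimal numpy sketch of that reprojection loss follows; struct2depth's per-object motion models, robust image losses, and online refinement are omitted, and the inputs (K, depth_tgt, T_tgt_to_src) are assumed to come from elsewhere.

```python
# Minimal photometric reprojection loss sketch for self-supervised depth.
import numpy as np

def photometric_loss(img_tgt, img_src, depth_tgt, K, T_tgt_to_src):
    # Back-project target pixels with predicted depth, move them with
    # the predicted ego-motion, and re-project into the source frame.
    h, w = depth_tgt.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T              # normalized camera rays
    pts = rays * depth_tgt.reshape(-1, 1)        # 3D points, target frame
    pts = pts @ T_tgt_to_src[:3, :3].T + T_tgt_to_src[:3, 3]
    proj = pts @ K.T
    u, v = proj[:, 0] / proj[:, 2], proj[:, 1] / proj[:, 2]
    # Nearest-neighbor sampling; real systems use bilinear interpolation
    # so the loss stays differentiable with respect to depth and pose.
    valid = (proj[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    ui, vi = u[valid].astype(int), v[valid].astype(int)
    tgt = img_tgt.reshape(-1, 3)[valid]
    return float(np.mean(np.abs(tgt - img_src[vi, ui])))
```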