
    Asynchronous Collaborative Autoscanning with Mode Switching for Multi-Robot Scene Reconstruction

    When conducting autonomous scanning for the online reconstruction of unknown indoor environments, robots have to be competent at exploring scene structure and reconstructing objects with high quality. Our key observation is that different tasks demand specialized scanning properties of robots: rapid moving speed and far vision for global exploration, and slow moving speed and narrow vision for local object reconstruction, which are referred to as two different scanning modes: explorer and reconstructor, respectively. When multiple robots are required to collaborate for efficient exploration and fine-grained reconstruction, the questions of when to generate those tasks and how to assign them must be carefully answered. Therefore, we propose a novel asynchronous collaborative autoscanning method with mode switching, which generates two kinds of scanning tasks with associated scanning modes, i.e., exploration tasks with explorer mode and reconstruction tasks with reconstructor mode, and assigns them to the robots to execute in an asynchronous collaborative manner to greatly boost scanning efficiency and reconstruction quality. The task assignment is optimized by solving a modified Multi-Depot Multiple Traveling Salesman Problem (MDMTSP). Moreover, to further enhance collaboration and increase efficiency, we propose a task-flow model that activates the task generation and assignment process immediately when any robot finishes all its tasks, with no need to wait for the other robots to complete the tasks assigned in the previous iteration. Extensive experiments show the importance of each key component of our method and its superiority over previous methods in scanning efficiency and reconstruction quality.
    Comment: 13 pages, 12 figures, Conference: SIGGRAPH Asia 202
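
    The asynchronous task-flow idea can be sketched in a few lines: the moment any robot's task queue empties, tasks are regenerated from the current partial map and reassigned across the whole team, so no robot idles while waiting for the others. The sketch below is only an illustration under that reading; generate_tasks, solve_assignment, and the round-robin assignment are hypothetical stand-ins, not the paper's modified MDMTSP solver.

```python
# Minimal sketch of an asynchronous task-flow trigger (illustrative only).
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Robot:
    name: str
    mode: str = "explorer"                       # "explorer" or "reconstructor"
    queue: List[Tuple[str, str]] = field(default_factory=list)

def generate_tasks(partial_map):
    """Hypothetical placeholder: derive tasks from the current partial reconstruction."""
    return [("explore", "frontier_3"), ("reconstruct", "object_7"), ("explore", "frontier_9")]

def solve_assignment(tasks, robots):
    """Hypothetical placeholder for the modified MDMTSP solver; here, naive round-robin."""
    plan = {r.name: [] for r in robots}
    for i, task in enumerate(tasks):
        plan[robots[i % len(robots)].name].append(task)
    return plan

def step(robots, partial_map):
    # Task-flow trigger: as soon as ANY robot runs out of tasks, generate and
    # assign new tasks immediately; idle robots never wait for the others.
    if any(not r.queue for r in robots):
        plan = solve_assignment(generate_tasks(partial_map), robots)
        for r in robots:
            r.queue.extend(plan[r.name])
    # Each robot executes its next task and switches scanning mode accordingly.
    for r in robots:
        if r.queue:
            kind, target = r.queue.pop(0)
            r.mode = "explorer" if kind == "explore" else "reconstructor"
            print(f"{r.name} [{r.mode}] -> {target}")

robots = [Robot("robot_a"), Robot("robot_b")]
step(robots, partial_map=None)
```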

    Kimera-Multi: Robust, Distributed, Dense Metric-Semantic SLAM for Multi-Robot Systems

    This paper presents Kimera-Multi, the first multi-robot system that (i) is robust and capable of identifying and rejecting incorrect inter- and intra-robot loop closures resulting from perceptual aliasing, (ii) is fully distributed and relies only on local (peer-to-peer) communication to achieve distributed localization and mapping, and (iii) builds a globally consistent metric-semantic 3D mesh model of the environment in real time, where faces of the mesh are annotated with semantic labels. Kimera-Multi is implemented by a team of robots equipped with visual-inertial sensors. Each robot builds a local trajectory estimate and a local mesh using Kimera. When communication is available, robots initiate a distributed place recognition and robust pose graph optimization protocol based on a novel distributed graduated non-convexity algorithm. The proposed protocol allows the robots to improve their local trajectory estimates by leveraging inter-robot loop closures while remaining robust to outliers. Finally, each robot uses its improved trajectory estimate to correct the local mesh using mesh deformation techniques. We demonstrate Kimera-Multi in photo-realistic simulations, SLAM benchmarking datasets, and challenging outdoor datasets collected using ground robots. Both real and simulated experiments involve long trajectories (e.g., up to 800 meters per robot). The experiments show that Kimera-Multi (i) outperforms the state of the art in terms of robustness and accuracy, (ii) achieves estimation errors comparable to a centralized SLAM system while being fully distributed, (iii) is parsimonious in terms of communication bandwidth, (iv) produces accurate metric-semantic 3D meshes, and (v) is modular and can also be used for standard 3D reconstruction (i.e., without semantic labels) or for trajectory estimation (i.e., without reconstructing a 3D mesh).
    Comment: Accepted by IEEE Transactions on Robotics (18 pages, 15 figures)
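
    The robust back end rests on graduated non-convexity (GNC): start from a nearly convex surrogate of a robust cost, solve a weighted least-squares problem, then gradually restore the non-convexity so that outlier measurements receive vanishing weights. The toy example below applies that schedule with a Geman-McClure surrogate to a one-dimensional estimation problem; it illustrates only the weighting idea, not Kimera-Multi's distributed pose-graph solver.

```python
# Toy GNC with a Geman-McClure surrogate: robust scalar estimation with outliers.
import numpy as np

def gnc_gm_estimate(z, c=1.0, outer_iters=20):
    x = z.mean()                                  # non-robust initial estimate
    r2 = (z - x) ** 2
    mu = max(2.0 * r2.max() / c**2, 1.0)          # large mu: surrogate is nearly convex
    for _ in range(outer_iters):
        w = (mu * c**2 / (r2 + mu * c**2)) ** 2   # Geman-McClure weight update
        x = np.average(z, weights=w)              # weighted least-squares step
        r2 = (z - x) ** 2
        mu = max(mu / 1.4, 1.0)                   # graduate toward the true robust cost
    return x, w

measurements = np.array([1.0, 1.1, 0.9, 1.05, 8.0, -6.0])  # last two are outliers
x_hat, weights = gnc_gm_estimate(measurements)
print(round(x_hat, 3), weights.round(2))          # outliers end up with near-zero weight
```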

    Semantic MapNet: Building Allocentric Semantic Maps and Representations from Egocentric Views

    We study the task of semantic mapping - specifically, an embodied agent (a robot or an egocentric AI assistant) is given a tour of a new environment and asked to build an allocentric top-down semantic map ("what is where?") from egocentric observations of an RGB-D camera with known pose (via localization sensors). Towards this goal, we present SemanticMapNet (SMNet), which consists of: (1) an Egocentric Visual Encoder that encodes each egocentric RGB-D frame, (2) a Feature Projector that projects egocentric features to appropriate locations on a floor plan, (3) a Spatial Memory Tensor of size floor-plan length × width × feature-dims that learns to accumulate projected egocentric features, and (4) a Map Decoder that uses the memory tensor to produce semantic top-down maps. SMNet combines the strengths of (known) projective camera geometry and neural representation learning. On the task of semantic mapping in the Matterport3D dataset, SMNet significantly outperforms competitive baselines by 4.01-16.81% (absolute) on mean-IoU and 3.81-19.69% (absolute) on Boundary-F1 metrics. Moreover, we show how to use the neural episodic memories and spatio-semantic allocentric representations built by SMNet for subsequent tasks in the same space - navigating to objects seen during the tour ("Find chair") or answering questions about the space ("How many chairs did you see in the house?").
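
    The Feature Projector step can be approximated with classical projective geometry: back-project each pixel using its depth and the known pose, drop it onto a discretized floor plan, and accumulate its feature vector into the memory tensor. The sketch below does this with simple average pooling and a hypothetical project_to_floorplan helper; SMNet learns the accumulation instead, so this is only an approximation of the geometry involved.

```python
# Minimal sketch: scatter per-pixel features into an allocentric (L, W, D) memory
# tensor using a pinhole camera model with known intrinsics and pose (illustrative only;
# a real system would also filter points by height above the floor).
import numpy as np

def project_to_floorplan(feat, depth, K, T_world_cam, memory, counts, cell=0.1):
    """feat: (H, W, D) per-pixel features, depth: (H, W) metres,
    K: 3x3 intrinsics, T_world_cam: 4x4 pose, memory/counts: running sums."""
    H, W, D = feat.shape
    v, u = np.mgrid[0:H, 0:W]
    z = depth
    x = (u - K[0, 2]) * z / K[0, 0]                  # back-project to camera frame
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=-1).reshape(-1, 4)
    pts_world = (T_world_cam @ pts_cam.T).T[:, :3]   # transform to world frame
    gx = (pts_world[:, 0] / cell).astype(int)        # grid indices on the floor plan
    gy = (pts_world[:, 2] / cell).astype(int)        # (x, z) span the ground plane here
    valid = (gx >= 0) & (gx < memory.shape[0]) & (gy >= 0) & (gy < memory.shape[1]) \
            & (depth.reshape(-1) > 0)
    f = feat.reshape(-1, D)
    np.add.at(memory, (gx[valid], gy[valid]), f[valid])   # accumulate features per cell
    np.add.at(counts, (gx[valid], gy[valid]), 1)
    return memory, counts

# Example: a 4 m x 4 m map at 0.1 m cells, 8-dim features, one dummy frame.
L = Wm = 40; D = 8
memory, counts = np.zeros((L, Wm, D)), np.zeros((L, Wm))
K = np.array([[320.0, 0, 160.0], [0, 320.0, 120.0], [0, 0, 1.0]])
feat = np.random.rand(240, 320, D)
depth = np.full((240, 320), 2.0)
memory, counts = project_to_floorplan(feat, depth, K, np.eye(4), memory, counts)
topdown = memory / np.maximum(counts[..., None], 1)       # average-pooled top-down map
```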