Asynchronous Collaborative Autoscanning with Mode Switching for Multi-Robot Scene Reconstruction
When conducting autonomous scanning for the online reconstruction of unknown
indoor environments, robots have to be competent at exploring scene structure
and reconstructing objects with high quality. Our key observation is that
different tasks demand specialized scanning properties of robots: rapid moving
speed and far vision for global exploration and slow moving speed and narrow
vision for local object reconstruction, which are referred to as two different
scanning modes: explorer and reconstructor, respectively. When requiring
multiple robots to collaborate for efficient exploration and fine-grained
reconstruction, the questions on when to generate and how to assign those tasks
should be carefully answered. Therefore, we propose a novel asynchronous
collaborative autoscanning method with mode switching, which generates two
kinds of scanning tasks with associated scanning modes, i.e., exploration task
with explorer mode and reconstruction task with reconstructor mode, and
assigns them to the robots to execute in an asynchronous collaborative manner,
which greatly boosts the scanning efficiency and reconstruction quality. The task assignment
is optimized by solving a modified Multi-Depot Multiple Traveling Salesman
Problem (MDMTSP). Moreover, to further enhance collaboration and increase
efficiency, we propose a task-flow model that activates the task generation
and assignment process as soon as any robot finishes all of its tasks,
without waiting for the other robots to complete the tasks assigned in the
previous iteration. Extensive experiments have been conducted to show the
importance of each key component of our method and the superiority over
previous methods in scanning efficiency and reconstruction quality.
Comment: 13 pages, 12 figures, Conference: SIGGRAPH Asia 202
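The assignment and task-flow ideas can be illustrated with a small sketch. The Python snippet below is not the paper's MDMTSP solver: it uses a greedy nearest-route assignment and hypothetical Task/Robot structures purely to show the asynchronous trigger, where new tasks are generated and redistributed as soon as any one robot's queue empties, without a global barrier.

```python
# Illustrative sketch only (assumed data structures, greedy assignment in place
# of the paper's modified MDMTSP optimization).
from dataclasses import dataclass, field
from typing import List, Tuple
import math

@dataclass
class Task:
    position: Tuple[float, float]
    mode: str                      # "explorer" (exploration) or "reconstructor" (reconstruction)

@dataclass
class Robot:
    position: Tuple[float, float]
    queue: List[Task] = field(default_factory=list)

def assign_tasks(robots: List[Robot], tasks: List[Task]) -> None:
    """Greedy stand-in for the modified MDMTSP assignment: each new task is
    appended to the route of the robot whose current route ends closest to it."""
    def route_end(r: Robot) -> Tuple[float, float]:
        return r.queue[-1].position if r.queue else r.position
    for task in tasks:
        best = min(robots, key=lambda r: math.dist(route_end(r), task.position))
        best.queue.append(task)

def on_robot_idle(robots: List[Robot], generate_tasks) -> None:
    """Task-flow trigger: called as soon as ANY robot finishes all its tasks;
    tasks are regenerated and reassigned immediately, with no need to wait
    for the other robots to finish their previously assigned tasks."""
    new_tasks = generate_tasks()   # yields fresh exploration + reconstruction tasks
    assign_tasks(robots, new_tasks)
```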
Kimera-Multi: Robust, Distributed, Dense Metric-Semantic SLAM for Multi-Robot Systems
This paper presents Kimera-Multi, the first multi-robot system that (i) is
robust and capable of identifying and rejecting incorrect inter- and intra-robot
loop closures resulting from perceptual aliasing, (ii) is fully distributed and
only relies on local (peer-to-peer) communication to achieve distributed
localization and mapping, and (iii) builds a globally consistent
metric-semantic 3D mesh model of the environment in real-time, where faces of
the mesh are annotated with semantic labels. Kimera-Multi is implemented by a
team of robots equipped with visual-inertial sensors. Each robot builds a local
trajectory estimate and a local mesh using Kimera. When communication is
available, robots initiate a distributed place recognition and robust pose
graph optimization protocol based on a novel distributed graduated
non-convexity algorithm. The proposed protocol allows the robots to improve
their local trajectory estimates by leveraging inter-robot loop closures while
being robust to outliers. Finally, each robot uses its improved trajectory
estimate to correct the local mesh using mesh deformation techniques.
We demonstrate Kimera-Multi in photo-realistic simulations, SLAM benchmarking
datasets, and challenging outdoor datasets collected using ground robots. Both
real and simulated experiments involve long trajectories (e.g., up to 800
meters per robot). The experiments show that Kimera-Multi (i) outperforms the
state of the art in terms of robustness and accuracy, (ii) achieves estimation
errors comparable to a centralized SLAM system while being fully distributed,
(iii) is parsimonious in terms of communication bandwidth, (iv) produces
accurate metric-semantic 3D meshes, and (v) is modular and can be also used for
standard 3D reconstruction (i.e., without semantic labels) or for trajectory
estimation (i.e., without reconstructing a 3D mesh).
Comment: Accepted by IEEE Transactions on Robotics (18 pages, 15 figures)
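For readers unfamiliar with graduated non-convexity (GNC), the sketch below shows the general idea on a toy problem: robust line fitting with a Geman-McClure cost, alternating weighted least squares with a GNC weight update. This is not Kimera-Multi's distributed pose graph protocol; the problem, constants, and annealing schedule are illustrative assumptions.

```python
# Minimal single-robot GNC sketch (assumed toy problem, not the paper's method).
import numpy as np

def gnc_geman_mcclure(A, b, c_bar=1.0, iters=20):
    """Robust least squares via GNC: A is the (n, d) design matrix, b the (n,)
    measurements. Returns the estimate x and per-measurement inlier weights."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]           # non-robust initialization
    r = b - A @ x
    mu = 2.0 * np.max(r**2) / c_bar**2 + 1e-9          # start with a near-convex surrogate
    w = np.ones_like(b)
    for _ in range(iters):
        W = np.sqrt(w)                                  # weighted least squares step
        x = np.linalg.lstsq(A * W[:, None], b * W, rcond=None)[0]
        r = b - A @ x
        w = (mu * c_bar**2 / (r**2 + mu * c_bar**2))**2 # Geman-McClure weight update
        mu = max(1.0, mu / 1.4)                         # gradually restore non-convexity
    return x, w

# Usage: fit y = 2x + 1 in the presence of a few gross outliers.
rng = np.random.default_rng(0)
xs = rng.uniform(0, 10, 50)
ys = 2 * xs + 1 + rng.normal(0, 0.1, 50)
ys[:5] += 30                                            # inject outliers
A = np.column_stack([xs, np.ones_like(xs)])
params, weights = gnc_geman_mcclure(A, ys, c_bar=0.3)   # outliers get weights near zero
```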
Semantic MapNet: Building Allocentric Semantic Maps and Representations from Egocentric Views
We study the task of semantic mapping - specifically, an embodied agent (a
robot or an egocentric AI assistant) is given a tour of a new environment and
asked to build an allocentric top-down semantic map ("what is where?") from
egocentric observations of an RGB-D camera with known pose (via localization
sensors). Towards this goal, we present SemanticMapNet (SMNet), which consists
of: (1) an Egocentric Visual Encoder that encodes each egocentric RGB-D frame,
(2) a Feature Projector that projects egocentric features to appropriate
locations on a floor-plan, (3) a Spatial Memory Tensor of size floor-plan
length x width x feature-dims that learns to accumulate projected egocentric
features, and (4) a Map Decoder that uses the memory tensor to produce semantic
top-down maps. SMNet combines the strengths of (known) projective camera
geometry and neural representation learning. On the task of semantic mapping in
the Matterport3D dataset, SMNet significantly outperforms competitive baselines
by 4.01-16.81% (absolute) on mean-IoU and 3.81-19.69% (absolute) on Boundary-F1
metrics. Moreover, we show how to use the neural episodic memories and
spatio-semantic allocentric representations built by SMNet for subsequent tasks
in the same space - navigating to objects seen during the tour ("Find chair") or
answering questions about the space ("How many chairs did you see in the
house?").