We present a novel approach for efficient and reliable goal-directed
long-horizon navigation for a multi-robot team in a structured, unknown
environment by predicting statistics of unknown space. Building on recent work
in learning-augmented model-based planning under uncertainty, we introduce a
high-level state and action abstraction that lets us approximate the
challenging Dec-POMDP as a tractable stochastic MDP. Our Multi-Robot Learning
over Subgoals Planner (MR-LSP) guides agents towards coordinated exploration of
regions more likely to lead to the unseen goal. We demonstrate improvement in
cost against other multi-robot strategies; in simulated office-like
environments, we show that our approach reduces average cost by 13.29% (2
robots) and 4.6% (3 robots) relative to standard non-learned optimistic planning and a
learning-informed baseline.

Comment: 7 pages, 7 figures, ICRA202