81,334 research outputs found
SayNav: Grounding Large Language Models for Dynamic Planning to Navigation in New Environments
Semantic reasoning and dynamic planning capabilities are crucial for an
autonomous agent to perform complex navigation tasks in unknown environments.
It requires a large amount of common-sense knowledge, that humans possess, to
succeed in these tasks. We present SayNav, a new approach that leverages human
knowledge from Large Language Models (LLMs) for efficient generalization to
complex navigation tasks in unknown large-scale environments. SayNav uses a
novel grounding mechanism, that incrementally builds a 3D scene graph of the
explored environment as inputs to LLMs, for generating feasible and
contextually appropriate high-level plans for navigation. The LLM-generated
plan is then executed by a pre-trained low-level planner, that treats each
planned step as a short-distance point-goal navigation sub-task. SayNav
dynamically generates step-by-step instructions during navigation and
continuously refines future steps based on newly perceived information. We
evaluate SayNav on a new multi-object navigation task, that requires the agent
to utilize a massive amount of human knowledge to efficiently search multiple
different objects in an unknown environment. SayNav outperforms an oracle based
Point-nav baseline, achieving a success rate of 95.35% (vs 56.06% for the
baseline), under the ideal settings on this task, highlighting its ability to
generate dynamic plans for successfully locating objects in large-scale new
environments. In addition, SayNav also enables efficient generalization of
learning to navigate from simulation to real novel environments
Perseus: Randomized Point-based Value Iteration for POMDPs
Partially observable Markov decision processes (POMDPs) form an attractive
and principled framework for agent planning under uncertainty. Point-based
approximate techniques for POMDPs compute a policy based on a finite set of
points collected in advance from the agents belief space. We present a
randomized point-based value iteration algorithm called Perseus. The algorithm
performs approximate value backup stages, ensuring that in each backup stage
the value of each point in the belief set is improved; the key observation is
that a single backup may improve the value of many belief points. Contrary to
other point-based methods, Perseus backs up only a (randomly selected) subset
of points in the belief set, sufficient for improving the value of each belief
point in the set. We show how the same idea can be extended to dealing with
continuous action spaces. Experimental results show the potential of Perseus in
large scale POMDP problems
Value Propagation Networks
We present Value Propagation (VProp), a set of parameter-efficient
differentiable planning modules built on Value Iteration which can successfully
be trained using reinforcement learning to solve unseen tasks, has the
capability to generalize to larger map sizes, and can learn to navigate in
dynamic environments. We show that the modules enable learning to plan when the
environment also includes stochastic elements, providing a cost-efficient
learning system to build low-level size-invariant planners for a variety of
interactive navigation problems. We evaluate on static and dynamic
configurations of MazeBase grid-worlds, with randomly generated environments of
several different sizes, and on a StarCraft navigation scenario, with more
complex dynamics, and pixels as input.Comment: Updated to match ICLR 2019 OpenReview's versio
Multi-criteria Evolution of Neural Network Topologies: Balancing Experience and Performance in Autonomous Systems
Majority of Artificial Neural Network (ANN) implementations in autonomous
systems use a fixed/user-prescribed network topology, leading to sub-optimal
performance and low portability. The existing neuro-evolution of augmenting
topology or NEAT paradigm offers a powerful alternative by allowing the network
topology and the connection weights to be simultaneously optimized through an
evolutionary process. However, most NEAT implementations allow the
consideration of only a single objective. There also persists the question of
how to tractably introduce topological diversification that mitigates
overfitting to training scenarios. To address these gaps, this paper develops a
multi-objective neuro-evolution algorithm. While adopting the basic elements of
NEAT, important modifications are made to the selection, speciation, and
mutation processes. With the backdrop of small-robot path-planning
applications, an experience-gain criterion is derived to encapsulate the amount
of diverse local environment encountered by the system. This criterion
facilitates the evolution of genes that support exploration, thereby seeking to
generalize from a smaller set of mission scenarios than possible with
performance maximization alone. The effectiveness of the single-objective
(optimizing performance) and the multi-objective (optimizing performance and
experience-gain) neuro-evolution approaches are evaluated on two different
small-robot cases, with ANNs obtained by the multi-objective optimization
observed to provide superior performance in unseen scenarios
- …