81,334 research outputs found

    SayNav: Grounding Large Language Models for Dynamic Planning to Navigation in New Environments

    Full text link
    Semantic reasoning and dynamic planning capabilities are crucial for an autonomous agent to perform complex navigation tasks in unknown environments. It requires a large amount of common-sense knowledge, that humans possess, to succeed in these tasks. We present SayNav, a new approach that leverages human knowledge from Large Language Models (LLMs) for efficient generalization to complex navigation tasks in unknown large-scale environments. SayNav uses a novel grounding mechanism, that incrementally builds a 3D scene graph of the explored environment as inputs to LLMs, for generating feasible and contextually appropriate high-level plans for navigation. The LLM-generated plan is then executed by a pre-trained low-level planner, that treats each planned step as a short-distance point-goal navigation sub-task. SayNav dynamically generates step-by-step instructions during navigation and continuously refines future steps based on newly perceived information. We evaluate SayNav on a new multi-object navigation task, that requires the agent to utilize a massive amount of human knowledge to efficiently search multiple different objects in an unknown environment. SayNav outperforms an oracle based Point-nav baseline, achieving a success rate of 95.35% (vs 56.06% for the baseline), under the ideal settings on this task, highlighting its ability to generate dynamic plans for successfully locating objects in large-scale new environments. In addition, SayNav also enables efficient generalization of learning to navigate from simulation to real novel environments

    Perseus: Randomized Point-based Value Iteration for POMDPs

    Full text link
    Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agents belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. Contrary to other point-based methods, Perseus backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set. We show how the same idea can be extended to dealing with continuous action spaces. Experimental results show the potential of Perseus in large scale POMDP problems

    Value Propagation Networks

    Full text link
    We present Value Propagation (VProp), a set of parameter-efficient differentiable planning modules built on Value Iteration which can successfully be trained using reinforcement learning to solve unseen tasks, has the capability to generalize to larger map sizes, and can learn to navigate in dynamic environments. We show that the modules enable learning to plan when the environment also includes stochastic elements, providing a cost-efficient learning system to build low-level size-invariant planners for a variety of interactive navigation problems. We evaluate on static and dynamic configurations of MazeBase grid-worlds, with randomly generated environments of several different sizes, and on a StarCraft navigation scenario, with more complex dynamics, and pixels as input.Comment: Updated to match ICLR 2019 OpenReview's versio

    Multi-criteria Evolution of Neural Network Topologies: Balancing Experience and Performance in Autonomous Systems

    Full text link
    Majority of Artificial Neural Network (ANN) implementations in autonomous systems use a fixed/user-prescribed network topology, leading to sub-optimal performance and low portability. The existing neuro-evolution of augmenting topology or NEAT paradigm offers a powerful alternative by allowing the network topology and the connection weights to be simultaneously optimized through an evolutionary process. However, most NEAT implementations allow the consideration of only a single objective. There also persists the question of how to tractably introduce topological diversification that mitigates overfitting to training scenarios. To address these gaps, this paper develops a multi-objective neuro-evolution algorithm. While adopting the basic elements of NEAT, important modifications are made to the selection, speciation, and mutation processes. With the backdrop of small-robot path-planning applications, an experience-gain criterion is derived to encapsulate the amount of diverse local environment encountered by the system. This criterion facilitates the evolution of genes that support exploration, thereby seeking to generalize from a smaller set of mission scenarios than possible with performance maximization alone. The effectiveness of the single-objective (optimizing performance) and the multi-objective (optimizing performance and experience-gain) neuro-evolution approaches are evaluated on two different small-robot cases, with ANNs obtained by the multi-objective optimization observed to provide superior performance in unseen scenarios
    • …
    corecore