19,863 research outputs found

    Learning Deployable Navigation Policies at Kilometer Scale from a Single Traversal

    Full text link
    Model-free reinforcement learning has recently been shown to be effective at learning navigation policies from complex image input. However, these algorithms tend to require large amounts of interaction with the environment, which can be prohibitively costly to obtain on robots in the real world. We present an approach for efficiently learning goal-directed navigation policies on a mobile robot, from only a single coverage traversal of recorded data. The navigation agent learns an effective policy over a diverse action space in a large heterogeneous environment consisting of more than 2km of travel, through buildings and outdoor regions that collectively exhibit large variations in visual appearance, self-similarity, and connectivity. We compare pretrained visual encoders that enable precomputation of visual embeddings to achieve a throughput of tens of thousands of transitions per second at training time on a commodity desktop computer, allowing agents to learn from millions of trajectories of experience in a matter of hours. We propose multiple forms of computationally efficient stochastic augmentation to enable the learned policy to generalise beyond these precomputed embeddings, and demonstrate successful deployment of the learned policy on the real robot without fine tuning, despite environmental appearance differences at test time. The dataset and code required to reproduce these results and apply the technique to other datasets and robots is made publicly available at rl-navigation.github.io/deployable

    Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments

    Full text link
    A robot that can carry out a natural-language instruction has been a dream since before the Jetsons cartoon series imagined a life of leisure mediated by a fleet of attentive robot helpers. It is a dream that remains stubbornly distant. However, recent advances in vision and language methods have made incredible progress in closely related areas. This is significant because a robot interpreting a natural-language navigation instruction on the basis of what it sees is carrying out a vision and language process that is similar to Visual Question Answering. Both tasks can be interpreted as visually grounded sequence-to-sequence translation problems, and many of the same methods are applicable. To enable and encourage the application of vision and language methods to the problem of interpreting visually-grounded navigation instructions, we present the Matterport3D Simulator -- a large-scale reinforcement learning environment based on real imagery. Using this simulator, which can in future support a range of embodied vision and language tasks, we provide the first benchmark dataset for visually-grounded natural language navigation in real buildings -- the Room-to-Room (R2R) dataset.Comment: CVPR 2018 Spotlight presentatio

    Effects of automation on situation awareness in controlling robot teams

    Get PDF
    Declines in situation awareness (SA) often accompany automation. Some of these effects have been characterized as out-of-the-loop, complacency, and automation bias. Increasing autonomy in multi-robot control might be expected to produce similar declines in operators’ SA. In this paper we review a series of experiments in which automation is introduced in controlling robot teams. Automating path planning at a foraging task improved both target detection and localization which is closely tied to SA. Timing data, however, suggested small declines in SA for robot location and pose. Automation of image acquisition, by contrast, led to poorer localization. Findings are discussed and alternative explanations involving shifts in strategy proposed

    Simultaneous localization and map-building using active vision

    No full text
    An active approach to sensing can provide the focused measurement capability over a wide field of view which allows correctly formulated Simultaneous Localization and Map-Building (SLAM) to be implemented with vision, permitting repeatable long-term localization using only naturally occurring, automatically-detected features. In this paper, we present the first example of a general system for autonomous localization using active vision, enabled here by a high-performance stereo head, addressing such issues as uncertainty-based measurement selection, automatic map-maintenance, and goal-directed steering. We present varied real-time experiments in a complex environment.Published versio
    corecore