10 research outputs found
Learning Deployable Navigation Policies at Kilometer Scale from a Single Traversal
Model-free reinforcement learning has recently been shown to be effective at
learning navigation policies from complex image input. However, these
algorithms tend to require large amounts of interaction with the environment,
which can be prohibitively costly to obtain on robots in the real world. We
present an approach for efficiently learning goal-directed navigation policies
on a mobile robot, from only a single coverage traversal of recorded data. The
navigation agent learns an effective policy over a diverse action space in a
large heterogeneous environment consisting of more than 2km of travel, through
buildings and outdoor regions that collectively exhibit large variations in
visual appearance, self-similarity, and connectivity. We compare pretrained
visual encoders that enable precomputation of visual embeddings to achieve a
throughput of tens of thousands of transitions per second at training time on a
commodity desktop computer, allowing agents to learn from millions of
trajectories of experience in a matter of hours. We propose multiple forms of
computationally efficient stochastic augmentation to enable the learned policy
to generalise beyond these precomputed embeddings, and demonstrate successful
deployment of the learned policy on the real robot without fine tuning, despite
environmental appearance differences at test time. The dataset and code
required to reproduce these results and apply the technique to other datasets
and robots is made publicly available at rl-navigation.github.io/deployable
CityLearn: Diverse Real-World Environments for Sample-Efficient Navigation Policy Learning
Visual navigation tasks in real-world environments often require both
self-motion and place recognition feedback. While deep reinforcement learning
has shown success in solving these perception and decision-making problems in
an end-to-end manner, these algorithms require large amounts of experience to
learn navigation policies from high-dimensional data, which is generally
impractical for real robots due to sample complexity. In this paper, we address
these problems with two main contributions. We first leverage place recognition
and deep learning techniques combined with goal destination feedback to
generate compact, bimodal image representations that can then be used to
effectively learn control policies from a small amount of experience. Second,
we present an interactive framework, CityLearn, that enables for the first time
training and deployment of navigation algorithms across city-sized, realistic
environments with extreme visual appearance changes. CityLearn features more
than 10 benchmark datasets, often used in visual place recognition and
autonomous driving research, including over 100 recorded traversals across 60
cities around the world. We evaluate our approach on two CityLearn
environments, training our navigation policy on a single traversal. Results
show our method can be over 2 orders of magnitude faster than when using raw
images, and can also generalize across extreme visual changes including day to
night and summer to winter transitions.Comment: Preprint version of article accepted to ICRA 202
Learning deployable navigation policies at kilometer scale from a single traversal
Model-free reinforcement learning has recently been shown to be effective at learning navigation policies from complex image input. However, these algorithms tend to require large amounts of interaction with the environment, which can be prohibitively costly to obtain on robots in the real world. We present an approach for efficiently learning goal-directed navigation policies on a mobile robot, from only a single coverage traversal of recorded data. The navigation agent learns an effective policy over a diverse action space in a large heterogeneous environment consisting of more than 2km of travel, through buildings and outdoor regions that collectively exhibit large variations in visual appearance, self-similarity, and connectivity. We compare pretrained visual encoders that enable precomputation of visual embeddings to achieve a throughput of tens of thousands of transitions per second at training time on a commodity desktop computer, allowing agents to learn from millions of trajectories of experience in a matter of hours. We propose multi- ple forms of computationally efficient stochastic augmentation to enable the learned policy to generalise beyond these precomputed embeddings, and demonstrate successful deployment of the learned policy on the real robot without fine tuning, despite environmental appearance differences at test time. The dataset and code required to reproduce these results and apply the technique to other datasets and robots is made publicly available at rl-navigation.github.io/deployable