Sparse Graphical Memory for Robust Planning
To operate effectively in the real world, agents should be able to act from
high-dimensional raw sensory input such as images and achieve diverse goals
across long time-horizons. Current deep reinforcement and imitation learning
methods can learn directly from high-dimensional inputs but do not scale well
to long-horizon tasks. In contrast, classical graphical methods like A* search
are able to solve long-horizon tasks, but assume that the state space is
abstracted away from raw sensory input. Recent works have attempted to combine
the strengths of deep learning and classical planning; however, dominant
methods in this domain are still quite brittle and scale poorly with the size
of the environment. We introduce Sparse Graphical Memory (SGM), a new data
structure that stores states and feasible transitions in a sparse memory. SGM
aggregates states according to a novel two-way consistency objective, adapting
classic state aggregation criteria to goal-conditioned RL: two states are
redundant when they are interchangeable both as goals and as starting states.
Theoretically, we prove that merging nodes according to two-way consistency
leads to an increase in shortest path lengths that scales only linearly with
the merging threshold. Experimentally, we show that SGM significantly
outperforms current state-of-the-art methods on long-horizon, sparse-reward
visual navigation tasks. Project video and code are available at
https://mishalaskin.github.io/sgm/
Comment: Accepted at NeurIPS 2020. Video and code at https://mishalaskin.github.io/sgm
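The two-way consistency criterion can be illustrated with a minimal sketch. Here `dist(a, b)` stands in for a learned goal-conditioned distance estimate, and the threshold `tau`, the greedy sparsification loop, and the toy 1-D states are all illustrative assumptions, not the paper's actual implementation:

```python
def two_way_consistent(s1, s2, dist, states, tau):
    """Two states are redundant when they are interchangeable both
    as goals and as starting states (within tolerance tau)."""
    as_goal = all(abs(dist(s, s1) - dist(s, s2)) <= tau for s in states)
    as_start = all(abs(dist(s1, s) - dist(s2, s)) <= tau for s in states)
    return as_goal and as_start

def sparsify(states, dist, tau):
    """Greedy sketch of building a sparse memory: keep a state only if
    it is not two-way consistent with any state already stored."""
    memory = []
    for s in states:
        if not any(two_way_consistent(s, m, dist, states, tau) for m in memory):
            memory.append(s)
    return memory
```

On 1-D points with `dist(a, b) = abs(a - b)` and `tau = 0.2`, nearby states collapse into a single stored node while well-separated ones survive.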
Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning
Learning is an inherently continuous phenomenon. When humans learn a new task
there is no explicit distinction between training and inference. As we learn a
task, we keep learning about it while performing the task. What we learn and
how we learn it varies during different stages of learning. Learning how to
learn and adapt is a key property that enables us to generalize effortlessly to
new settings. This is in contrast with conventional settings in machine
learning where a trained model is frozen during inference. In this paper we
study the problem of learning to learn at both training and test time in the
context of visual navigation. A fundamental challenge in navigation is
generalization to unseen scenes. In this paper we propose a self-adaptive
visual navigation method (SAVN) which learns to adapt to new environments
without any explicit supervision. Our solution is a meta-reinforcement learning
approach where an agent learns a self-supervised interaction loss that
encourages effective navigation. Our experiments, performed in the AI2-THOR
framework, show major improvements in both success rate and SPL for visual
navigation in novel scenes. Our code and data are available at:
https://github.com/allenai/savn
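The test-time adaptation idea can be sketched in a deliberately simplified form. `interaction_loss_grad` stands in for the gradient of the learned self-supervised interaction loss and `policy` for the navigation policy; both are illustrative placeholders, not the paper's networks:

```python
import numpy as np

def adapt_then_act(theta, interaction_loss_grad, obs, policy, steps=1, lr=0.1):
    """Self-adaptation sketch: take gradient steps on a self-supervised
    interaction loss (no reward or labels needed at test time), then act
    with the adapted parameters."""
    theta = np.array(theta, dtype=float)
    for _ in range(steps):
        theta = theta - lr * interaction_loss_grad(theta, obs)
    return policy(theta, obs)
```

The point of the sketch is only the control flow: adaptation happens during inference, driven by a loss the agent can compute itself.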
A Behavioral Approach to Visual Navigation with Graph Localization Networks
Inspired by research in psychology, we introduce a behavioral approach for
visual navigation using topological maps. Our goal is to enable a robot to
navigate from one location to another, relying only on its visual input and the
topological map of the environment. We propose using graph neural networks for
localizing the agent in the map, and decompose the action space into primitive
behaviors implemented as convolutional or recurrent neural networks. Using the
Gibson simulator, we verify that our approach outperforms relevant baselines
and is able to navigate in both seen and unseen environments.
Comment: Video: https://youtu.be/nN3B1F90CF
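The overall scheme (localize on a topological map, then execute primitive behaviors along edges) can be sketched with a plain BFS planner. The graph structure and behavior names below are invented for illustration; in the paper, localization is done by a graph neural network and the behaviors are learned networks:

```python
from collections import deque

def plan_behaviors(graph, start, goal):
    """BFS over a topological map whose edges carry primitive-behavior
    labels; returns the sequence of behaviors to execute, or None if
    the goal is unreachable."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        node, behaviors = frontier.popleft()
        if node == goal:
            return behaviors
        for nxt, behavior in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, behaviors + [behavior]))
    return None
```

Decomposing the action space this way means the planner only reasons over a small discrete graph, while the behaviors handle raw visual control.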
Unsupervised Emergence of Egocentric Spatial Structure from Sensorimotor Prediction
Despite its omnipresence in robotics applications, the nature of spatial knowledge and the mechanisms that underlie its emergence in autonomous agents are still poorly understood. Recent theoretical works suggest that the Euclidean structure of space induces invariants in an agent's raw sensorimotor experience. We hypothesize that capturing these invariants is beneficial for sensorimotor prediction and that, under certain exploratory conditions, a motor representation capturing the structure of the external space should emerge as a byproduct of learning to predict future sensory experiences. We propose a simple sensorimotor predictive scheme, apply it to different agents and types of exploration, and evaluate the pertinence of these hypotheses. We show that a naive agent can capture the topology and metric regularity of its sensor's position in an egocentric spatial frame without any a priori knowledge, nor extraneous supervision.
Graph-based State Representation for Deep Reinforcement Learning
Deep RL approaches build much of their success on the ability of the deep
neural network to generate useful internal representations. Nevertheless, they
suffer from high sample complexity, and starting from a good input
representation can have a significant impact on performance. In this paper,
we exploit the fact that the underlying Markov decision process (MDP)
represents a graph, which enables us to incorporate the topological information
for effective state representation learning.
Motivated by the recent success of node representations for several graph
analytical tasks we specifically investigate the capability of node
representation learning methods to effectively encode the topology of the
underlying MDP in Deep RL. To this end we perform a comparative analysis of
several models chosen from 4 different classes of representation learning
algorithms for policy learning in grid-world navigation tasks, which are
representative of a large class of RL problems. We find that all embedding
methods outperform the commonly used matrix representation of grid-world
environments in all of the studied cases. Moreover, graph-convolution-based
methods are outperformed by simpler random-walk-based methods and graph linear
autoencoders.
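The core preprocessing step (treating the grid-world MDP as a graph and deriving topology-aware node features) can be sketched as follows. The k-step random-walk visitation distribution used here is an illustrative stand-in for the random-walk embedding methods compared in the paper:

```python
import numpy as np

def grid_to_graph(grid):
    """Adjacency matrix of the MDP graph underlying a grid world
    (1 = free cell, 0 = wall), with 4-connected moves."""
    h, w = len(grid), len(grid[0])
    idx = {(r, c): r * w + c for r in range(h) for c in range(w)}
    A = np.zeros((h * w, h * w))
    for r in range(h):
        for c in range(w):
            if not grid[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and grid[rr][cc]:
                    A[idx[(r, c)], idx[(rr, cc)]] = 1.0
    return A

def random_walk_features(A, steps=3):
    """k-step random-walk visitation distributions as simple
    topology-aware node representations."""
    deg = A.sum(axis=1, keepdims=True)
    P = np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)
    return np.linalg.matrix_power(P, steps)
```

Unlike the raw matrix representation of the grid, these features change whenever the connectivity of the environment changes, which is exactly the topological information the paper argues should be exposed to the policy.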
Plan2Vec: Unsupervised Representation Learning by Latent Plans
In this paper we introduce plan2vec, an unsupervised representation learning
approach that is inspired by reinforcement learning. Plan2vec constructs a
weighted graph on an image dataset using near-neighbor distances, and then
extrapolates this local metric to a global embedding by distilling path
integrals over planned paths. When applied to control, plan2vec offers a
compute- and sample-efficient way to learn goal-conditioned value estimates
that are accurate over long horizons. We demonstrate the effectiveness of
plan2vec on one simulated and two challenging real-world image datasets.
Experimental results show that plan2vec successfully amortizes the planning
cost, enabling reactive planning that is linear in memory and computation
complexity rather than exhaustive over the entire state space.
Comment: code available at https://geyang.github.io/plan2ve
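The first two stages (a weighted near-neighbor graph under a local metric, then global shortest-path distances that an embedding is later distilled to reproduce) can be sketched directly. The plain-vector points and `local_dist` below are illustrative; in plan2vec the nodes are images and the local metric is learned:

```python
import heapq

def knn_graph(points, k, local_dist):
    """Weighted k-nearest-neighbour graph under a local metric."""
    graph = {i: [] for i in range(len(points))}
    for i, p in enumerate(points):
        dists = sorted((local_dist(p, q), j)
                       for j, q in enumerate(points) if j != i)
        graph[i] = [(j, d) for d, j in dists[:k]]
    return graph

def global_distances(graph, source):
    """Dijkstra shortest-path lengths from `source`: the global metric
    that the embedding would be trained to reproduce."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Distilling these planned distances into an embedding is what amortizes the planning cost: at test time a distance query is one embedding lookup rather than a graph search.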
Self-Organizing Maps as a Storage and Transfer Mechanism in Reinforcement Learning
The idea of reusing information from previously learned tasks (source tasks)
for the learning of new tasks (target tasks) has the potential to significantly
improve the sample efficiency of reinforcement learning agents. In this work, we
describe an approach to concisely store and represent learned task knowledge,
and reuse it by allowing it to guide the exploration of an agent while it
learns new tasks. In order to do so, we use a measure of similarity that is
defined directly in the space of parameterized representations of the value
functions. This similarity measure is also used as a basis for a variant of the
growing self-organizing map algorithm, which is simultaneously used to enable
the storage of previously acquired task knowledge in an adaptive and scalable
manner. We empirically validate our approach in a simulated navigation
environment and discuss possible extensions to this approach along with
potential applications where it could be particularly useful.
Comment: 7 pages, 7 figures, presented at ALA Workshop, FAIM, Stockholm, 201
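A growing-map-style storage rule driven by similarity in value-function parameter space can be sketched as follows. The negative-Euclidean-distance similarity, the threshold, and the learning rate are illustrative assumptions, not the paper's exact growing self-organizing map update:

```python
import numpy as np

def store_task(nodes, theta, grow_threshold, lr=0.5):
    """Storage sketch: `theta` is a task's value-function parameter
    vector. If no stored node is close enough in parameter space,
    grow a new node; otherwise nudge the best-matching node toward
    the new task."""
    if not nodes:
        return nodes + [np.array(theta, dtype=float)]
    dists = [np.linalg.norm(n - theta) for n in nodes]
    best = int(np.argmin(dists))
    if dists[best] > grow_threshold:
        nodes.append(np.array(theta, dtype=float))
    else:
        nodes[best] = nodes[best] + lr * (np.array(theta) - nodes[best])
    return nodes
```

Because similar tasks collapse into one node, the memory stays compact, and the best-matching node can later bias exploration when a related target task arrives.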
How to reduce computation time while sparing performance during robot navigation? A neuro-inspired architecture for autonomous shifting between model-based and model-free learning
Taking inspiration from how the brain coordinates multiple learning systems
is an appealing strategy to endow robots with more flexibility. One of the
expected advantages would be for robots to autonomously switch to the least
costly system when its performance is satisfying. However, to our knowledge no
study on a real robot has yet shown that the measured computational cost is
reduced while performance is maintained with such brain-inspired algorithms. We
present navigation experiments involving paths of different lengths to the
goal, dead-ends, and non-stationarity (i.e., changes in goal location and
the appearance of obstacles). We present a novel arbitration mechanism between
learning systems that explicitly measures performance and cost. We find that
the robot can adapt to environment changes by switching between learning
systems so as to maintain a high performance. Moreover, when the task is
stable, the robot also autonomously shifts to the least costly system, which
leads to a drastic reduction in computation cost while keeping a high
performance. Overall, these results illustrate the benefit of using multiple
learning systems.
Comment: 12 pages, 4 figures; Living Machines 202
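The arbitration logic (maintain performance, and fall back to the cheaper system when it performs about as well) can be sketched in a few lines. The tolerance `eps` and the two-system naming are illustrative, not the paper's measured criterion:

```python
def arbitrate(perf, cost, eps=0.05):
    """Among learning systems whose recent performance is within `eps`
    of the best, pick the least costly one. `perf` and `cost` map
    system name -> measured performance / computational cost."""
    best = max(perf.values())
    candidates = [s for s, p in perf.items() if best - p <= eps]
    return min(candidates, key=lambda s: cost[s])
```

When the task is stable the two systems' performances converge, so the rule drifts toward the cheap (e.g. model-free) system; after an environment change the expensive system's performance advantage pulls control back.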
Unsupervised Emergence of Spatial Structure from Sensorimotor Prediction
Despite its omnipresence in robotics applications, the nature of spatial
knowledge and the mechanisms that underlie its emergence in autonomous agents
are still poorly understood. Recent theoretical work suggests that the concept
of space can be grounded by capturing invariants induced by the structure of
space in an agent's raw sensorimotor experience. Moreover, it is hypothesized
that capturing these invariants is beneficial for a naive agent trying to
predict its sensorimotor experience. Under certain exploratory conditions,
spatial representations should thus emerge as a byproduct of learning to
predict. We propose a simple sensorimotor predictive scheme, apply it to
different agents and types of exploration, and evaluate the pertinence of this
hypothesis. We show that a naive agent can capture the topology and metric
regularity of its spatial configuration without any a priori knowledge, nor
extraneous supervision.
Comment: 16 pages, 6 figure
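A deliberately naive version of the sensorimotor predictive scheme can be sketched as a least-squares predictor of the next sensation from the motor command. This linear stand-in is an assumption for illustration (the paper uses a learned predictive network), but it shows where spatial structure can appear: motor commands become comparable through their predicted sensory effects:

```python
import numpy as np

def fit_predictor(motors, sensations_next):
    """Least-squares sensorimotor predictor: next sensation as a
    linear function of the motor command."""
    M = np.asarray(motors, dtype=float)
    S = np.asarray(sensations_next, dtype=float)
    W, *_ = np.linalg.lstsq(M, S, rcond=None)
    return W
```

Once fitted, `W` induces a metric over motor commands (distance between predicted sensations), which is the kind of regularity the paper argues should mirror the external space.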
IRLAS: Inverse Reinforcement Learning for Architecture Search
In this paper, we propose an inverse reinforcement learning method for
architecture search (IRLAS), which trains an agent to search for network
structures that are topologically inspired by human-designed networks. Most
existing architecture search approaches entirely neglect the topological
characteristics of architectures, which results in complicated architectures
with high inference latency. Motivated by the fact that human-designed
networks are elegant in topology and fast at inference, we propose a
mirror stimuli function inspired by biological cognition theory to extract the
abstract topological knowledge of an expert human-designed network (ResNeXt). To
avoid imposing too strong a prior over the search space, we introduce inverse
reinforcement learning to train the mirror stimuli function and exploit it as a
heuristic guide for architecture search, easily generalized to different
architecture search algorithms. On CIFAR-10, the best architecture searched by
our proposed IRLAS achieves a 2.60% error rate. For the ImageNet mobile setting, our
model achieves a state-of-the-art top-1 accuracy of 75.28%, while being 2~4x
faster than most auto-generated architectures. A fast version of this model
is 10% faster than MobileNetV2, while maintaining higher accuracy.
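How a topology-similarity signal could enter the search reward can be sketched as below. The cosine similarity between hand-crafted topological encodings is an illustrative assumption; in IRLAS the mirror stimuli function is learned via inverse reinforcement learning rather than fixed:

```python
import math

def search_reward(accuracy, topo_encoding, expert_encoding, weight=0.1):
    """Combine the task reward (validation accuracy) with a similarity
    between a candidate architecture's topological encoding and an
    expert network's encoding (cosine similarity used here as a
    stand-in for the learned mirror stimuli function)."""
    dot = sum(a * b for a, b in zip(topo_encoding, expert_encoding))
    na = math.sqrt(sum(a * a for a in topo_encoding))
    nb = math.sqrt(sum(b * b for b in expert_encoding))
    sim = dot / (na * nb) if na and nb else 0.0
    return accuracy + weight * sim
```

Keeping the topology term as a small additive bonus (rather than a hard constraint) is what lets the prior guide the search without dominating it, matching the paper's concern about imposing too strong a prior.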