Automatic Curriculum Learning For Deep RL: A Short Survey
Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization, or to solve sparse-reward problems, among others. The ambition of this work is twofold: 1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature and 2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas.
Comment: Accepted at IJCAI 202
GRIMGEP: Learning Progress for Robust Goal Sampling in Visual Deep Reinforcement Learning
Designing agents capable of autonomously learning a wide range of skills is critical to increasing the scope of reinforcement learning. It will both increase the diversity of learned skills and reduce the burden of manually
designing reward functions for each skill. Self-supervised agents, setting
their own goals, and trying to maximize the diversity of those goals have shown
great promise towards this end. However, a currently known limitation of agents
trying to maximize the diversity of sampled goals is that they tend to get
attracted to noise or more generally to parts of the environments that cannot
be controlled (distractors). When agents have access to predefined goal
features or expert knowledge, absolute Learning Progress (ALP) provides a way
to distinguish between regions that can be controlled and those that cannot.
However, those methods often fall short when the agents are only provided with
raw sensory inputs such as images. In this work we extend those concepts to
unsupervised image-based goal exploration. We propose a framework that allows
agents to autonomously identify and ignore noisy distracting regions while
searching for novelty in the learnable regions to both improve overall
performance and avoid catastrophic forgetting. Our framework can be combined
with any state-of-the-art novelty-seeking goal exploration approach. We construct a rich image-based 3D environment with distractors. Experiments on this environment show that agents using our framework successfully identify interesting regions of the environment, resulting in drastically improved performance. The source code is available at
https://sites.google.com/view/grimgep
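The absolute Learning Progress (ALP) idea referenced above can be illustrated with a minimal sketch: per-region competence is tracked over time, and regions are sampled in proportion to the absolute change in competence, so both improving regions and forgetting-prone regions attract attention while static or uncontrollable (distractor) regions do not. The class name, window mechanism, and optimistic initialization below are illustrative assumptions, not the paper's exact formulation.

```python
import random

class ALPSampler:
    """Illustrative absolute-learning-progress goal-region sampler (assumed interface)."""

    def __init__(self, n_regions, window=10):
        # Per-region history of success outcomes (1.0 = goal reached).
        self.history = [[] for _ in range(n_regions)]
        self.window = window

    def update(self, region, success):
        self.history[region].append(float(success))

    def alp(self, region):
        h = self.history[region]
        if len(h) < 2 * self.window:
            return 1.0  # optimistic value: explore regions with little data
        recent = sum(h[-self.window:]) / self.window
        older = sum(h[-2 * self.window:-self.window]) / self.window
        # Absolute value: both improvement and forgetting attract sampling.
        return abs(recent - older)

    def sample(self):
        weights = [self.alp(r) for r in range(len(self.history))]
        if sum(weights) == 0:
            return random.randrange(len(self.history))
        return random.choices(range(len(self.history)), weights=weights)[0]
```

A noisy distractor region yields a flat success history (constant failure), so its ALP collapses to zero and it stops being sampled, which is the behavior the abstract describes.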
Learning with AMIGo: Adversarially Motivated Intrinsic Goals
A key challenge for reinforcement learning (RL) consists of learning in
environments with sparse extrinsic rewards. In contrast to current RL methods,
humans are able to learn new skills with little or no reward by using various
forms of intrinsic motivation. We propose AMIGo, a novel agent incorporating --
as a form of meta-learning -- a goal-generating teacher that proposes
Adversarially Motivated Intrinsic Goals to train a goal-conditioned "student"
policy in the absence of (or alongside) environment reward. Specifically,
through a simple but effective "constructively adversarial" objective, the
teacher learns to propose increasingly challenging -- yet achievable -- goals
that allow the student to learn general skills for acting in a new environment,
independent of the task to be solved. We show that our method generates a
natural curriculum of self-proposed goals which ultimately allows the agent to
solve challenging procedurally-generated tasks where other forms of intrinsic
motivation and state-of-the-art RL methods fail.
Comment: 18 pages, 6 figures, published at The Ninth International Conference on Learning Representations (2021
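The "constructively adversarial" objective described above can be sketched in a few lines: the teacher is rewarded when the student reaches the proposed goal, but only when doing so takes more than some difficulty threshold of steps, so goals stay challenging yet achievable. The function name, threshold, and reward constants below are illustrative assumptions, not the paper's exact formulation.

```python
def teacher_reward(goal_reached: bool, steps_taken: int,
                   t_star: int = 10,
                   alpha: float = 1.0, beta: float = -1.0) -> float:
    """Illustrative AMIGo-style teacher reward (assumed constants).

    +alpha if the student reached the goal but needed more than t_star
    steps (hard but solvable); beta otherwise (too easy, or unreachable).
    """
    if goal_reached and steps_taken > t_star:
        return alpha
    return beta
```

Under this signal, goals the student solves instantly and goals it cannot solve at all both penalize the teacher, pushing proposed goals toward the frontier of the student's current competence, which is what produces the natural curriculum the abstract refers to.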
ELSIM: End-to-end learning of reusable skills through intrinsic motivation
Taking inspiration from developmental learning, we present a novel
reinforcement learning architecture which hierarchically learns and represents
self-generated skills in an end-to-end way. With this architecture, an agent
focuses only on task-rewarded skills while keeping the learning process of
skills bottom-up. This bottom-up approach allows the agent to learn skills that 1) are transferable across tasks and 2) improve exploration when rewards are sparse. To
do so, we combine a previously defined mutual information objective with a
novel curriculum learning algorithm, creating an unlimited and explorable tree
of skills. We test our agent on simple gridworld environments to understand and
visualize how the agent distinguishes between its skills. Then we show that our
approach can scale to more difficult MuJoCo environments, in which our agent is able to build a representation of skills that improves both transfer learning and exploration over a baseline when rewards are sparse.
Comment: Accepted at ECML 202
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
Acquiring abilities in the absence of a task-oriented reward function is at
the frontier of reinforcement learning research. This problem has been studied
through the lens of empowerment, which draws a connection between option
discovery and information theory. Information-theoretic skill discovery methods
have garnered much interest from the community, but little research has been
conducted in understanding their limitations. Through theoretical analysis and
empirical evidence, we show that existing algorithms suffer from a common
limitation -- they discover options that provide a poor coverage of the state
space. In light of this, we propose 'Explore, Discover and Learn' (EDL), an
alternative approach to information-theoretic skill discovery. Crucially, EDL
optimizes the same information-theoretic objective derived from the empowerment
literature, but addresses the optimization problem using different machinery.
We perform an extensive evaluation of skill discovery methods on controlled
environments and show that EDL offers significant advantages, such as
overcoming the coverage problem, reducing the dependence of learned skills on
the initial state, and allowing the user to define a prior over which behaviors
should be learned. Code is publicly available at
https://github.com/victorcampos7/edl.
Comment: 17 pages, 11 figures.
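The information-theoretic skill discovery objective shared by the works above (ELSIM, EDL, and related empowerment methods) is commonly implemented as a variational intrinsic reward of the form r(s, z) = log q(z | s) - log p(z), where q is a learned skill discriminator and p(z) a skill prior. The sketch below assumes a uniform prior and a precomputed discriminator output; each paper instantiates and optimizes this objective differently.

```python
import math

def intrinsic_reward(discriminator_probs, skill, n_skills):
    """Illustrative variational skill-discovery reward (assumed interface).

    discriminator_probs: q(z | s), the discriminator's distribution over
    skills at the current state. Reward is high when the state makes the
    active skill easy to identify, i.e. skills visit distinguishable states.
    """
    log_q = math.log(max(discriminator_probs[skill], 1e-8))  # clamp for stability
    log_p = math.log(1.0 / n_skills)  # uniform prior over skills
    return log_q - log_p
```

The coverage limitation EDL identifies can be read off this reward: skills need only be distinguishable, not far apart, so maximizing it does not by itself force the skills to spread over the state space.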
Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills
Mutual information-based reinforcement learning (RL) has been proposed as a
promising framework for retrieving complex skills autonomously without a
task-oriented reward function through mutual information (MI) maximization or
variational empowerment. However, learning complex skills is still challenging,
due to the fact that the order of training skills can largely affect sample
efficiency. Inspired by this, we recast variational empowerment as curriculum
learning in goal-conditioned RL with an intrinsic reward function, which we
name Variational Curriculum RL (VCRL). From this perspective, we propose a
novel approach to unsupervised skill discovery based on information theory,
called Value Uncertainty Variational Curriculum (VUVC). We prove that, under
regularity conditions, VUVC accelerates the increase of entropy in the visited
states compared to the uniform curriculum. We validate the effectiveness of our
approach on complex navigation and robotic manipulation tasks in terms of
sample efficiency and state coverage speed. We also demonstrate that the skills
discovered by our method successfully complete a real-world robot navigation
task in a zero-shot setup and that incorporating these skills with a global
planner further increases the performance.
Comment: ICML 2023. First two authors contributed equally. Code at
https://github.com/seongun-kim/vcr
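The value-uncertainty idea in VUVC can be loosely sketched as follows: if each candidate goal carries an ensemble of value estimates, goals where the ensemble disagrees most (high epistemic uncertainty, i.e. the frontier of competence) are sampled preferentially. The ensemble interface and proportional sampling rule below are assumptions for illustration, not the paper's algorithm.

```python
import random

def value_uncertainty(value_estimates):
    """Population std-dev of an ensemble of value estimates for one goal."""
    n = len(value_estimates)
    mean = sum(value_estimates) / n
    return (sum((v - mean) ** 2 for v in value_estimates) / n) ** 0.5

def sample_goal(goal_values, rng=random):
    """Sample a goal index with probability proportional to its uncertainty.

    goal_values: one list of ensemble value estimates per candidate goal.
    """
    weights = [value_uncertainty(ens) for ens in goal_values]
    total = sum(weights)
    if total == 0:
        return rng.randrange(len(goal_values))  # no signal: sample uniformly
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1
```

Goals whose value the ensemble already agrees on (mastered or hopeless) get zero weight, concentrating the curriculum on goals whose outcome is still uncertain.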
Explainable Artificial Intelligence (xAI) Approaches and Deep Meta-Learning Models
Explainable artificial intelligence (xAI) is one of the interesting issues that has emerged recently. Many researchers are approaching the subject from different dimensions, and interesting results have come out. However, we are still at the beginning of the road to understanding these types of models, and the forthcoming years are expected to be years in which the openness of deep learning models is discussed. Among classical artificial intelligence approaches, we frequently encounter the deep learning methods available today. These deep learning methods can yield highly effective results depending on the data set size, data set quality, the methods used in feature extraction, the hyperparameter set used in deep learning models, the activation functions, and the optimization algorithms. However, current deep learning models have important shortcomings. These artificial neural network-based models are black-box models that generalize from the data transmitted to them and learn from that data; therefore, the relational link between input and output is not observable. This is an important open point in artificial neural networks and deep learning models. For these reasons, serious efforts are needed on the explainability and interpretability of black-box models.