Attention-based Curiosity-driven Exploration in Deep Reinforcement Learning
Reinforcement Learning enables an agent to be trained through interaction with
its environment. However, in most real-world scenarios, the extrinsic feedback
is sparse or insufficient, so intrinsic reward formulations are needed to train
the agent successfully. This work investigates and extends the
paradigm of curiosity-driven exploration. First, a probabilistic approach is
taken to exploit the advantages of the attention mechanism, which is
successfully applied in other domains of Deep Learning. Combining the two, we
propose new methods, such as AttA2C, an extension of the Actor-Critic
framework. Second, another curiosity-based approach, the Intrinsic Curiosity
Module (ICM), is extended. The proposed model uses attention to emphasize
features for the dynamic models within ICM; moreover, we modify the loss
function, resulting in a new curiosity formulation, which we call rational
curiosity. The corresponding implementation can be found at
https://github.com/rpatrik96/AttA2C/.
Comment: Submitted to ICASSP 2020, 5 pages, 8 figures, 2 tables
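The abstract leaves the exact architecture to the paper and the linked repository, so the following is only a minimal sketch of an ICM-style curiosity bonus with an attention gate over features; the layer sizes, the placement of the attention, and the reward scale eta are illustrative assumptions, not the authors' formulation.

# Minimal sketch: ICM-style curiosity with a feature-attention gate.
# All names, layer sizes, and the scale `eta` are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveICM(nn.Module):
    def __init__(self, obs_dim, act_dim, feat_dim=64, eta=0.01):
        super().__init__()
        self.eta = eta
        self.encoder = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.ReLU())
        # Attention over feature dimensions, conditioned on phi(s) and the action.
        self.attn = nn.Linear(feat_dim + act_dim, feat_dim)
        # Forward dynamics model: predicts phi(s') from attended phi(s) and the action.
        self.dynamics = nn.Linear(feat_dim + act_dim, feat_dim)

    def intrinsic_reward(self, obs, action, next_obs):
        phi, phi_next = self.encoder(obs), self.encoder(next_obs)
        # Softmax attention weights emphasize a subset of feature dimensions.
        w = torch.softmax(self.attn(torch.cat([phi, action], dim=-1)), dim=-1)
        pred = self.dynamics(torch.cat([w * phi, action], dim=-1))
        # Curiosity bonus: forward-model prediction error in feature space.
        err = F.mse_loss(pred, phi_next.detach(), reduction="none")
        return self.eta * err.mean(dim=-1)

In this sketch the bonus is simply the forward-model error on attended features; the paper's "rational curiosity" additionally modifies the loss function, which is not reproduced here.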
SEMI: Self-supervised Exploration via Multisensory Incongruity
Efficient exploration is a long-standing problem in reinforcement learning.
In this work, we introduce a self-supervised exploration policy by
incentivizing the agent to maximize multisensory incongruity, which can be
measured in two aspects: perception incongruity and action incongruity. The
former represents the uncertainty in the multisensory fusion model, while the
latter represents the uncertainty in an agent's policy. Specifically, an
alignment predictor is trained to detect whether multiple sensory inputs are
aligned, the error of which is used to measure perception incongruity. The
policy takes the multisensory observations with sensory-wise dropout as input
and outputs actions for exploration. The variance of actions is further used to
measure action incongruity. Our formulation allows the agent to learn skills by
exploring in a self-supervised manner without any external rewards. In addition,
our method enables the agent to learn a compact multimodal representation from
hard examples, which further improves the sample efficiency of our policy
learning. We demonstrate the efficacy of this formulation across a variety of
benchmark environments, including object manipulation and audio-visual games.
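The abstract names the two incongruity signals but not their implementation, so the snippet below is a hedged sketch of one plausible reading: an alignment predictor whose error gives perception incongruity, and action variance under sensory-wise dropout for action incongruity. The modality split (visual/audio), the dropout scheme, and all layer sizes are assumptions for illustration.

# Hedged sketch of the two incongruity signals described in the SEMI abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class IncongruityBonus(nn.Module):
    def __init__(self, vis_dim, aud_dim, act_dim, hidden=64, n_masks=8):
        super().__init__()
        self.n_masks = n_masks
        # Alignment predictor: is this (visual, audio) pair aligned?
        self.align = nn.Sequential(
            nn.Linear(vis_dim + aud_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        # Exploration policy over the concatenated multisensory observation.
        self.policy = nn.Sequential(
            nn.Linear(vis_dim + aud_dim, hidden), nn.ReLU(), nn.Linear(hidden, act_dim))

    def perception_incongruity(self, vis, aud, is_aligned):
        # Error of the alignment predictor on (possibly mismatched) pairs;
        # `is_aligned` is a float tensor of 0/1 labels.
        logit = self.align(torch.cat([vis, aud], dim=-1)).squeeze(-1)
        return F.binary_cross_entropy_with_logits(logit, is_aligned, reduction="none")

    def action_incongruity(self, vis, aud):
        # Sensory-wise dropout: zero out whole modalities at random, then
        # take the variance of the policy's actions across dropout masks.
        acts = []
        for _ in range(self.n_masks):
            v = vis * (torch.rand(()) < 0.5).float()
            a = aud * (torch.rand(()) < 0.5).float()
            acts.append(self.policy(torch.cat([v, a], dim=-1)))
        return torch.stack(acts).var(dim=0).mean(dim=-1)

The intrinsic reward would then combine the two terms, e.g. as a weighted sum; the weighting and the hard-example mining for the multimodal representation are left to the paper.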
FOCUS: Object-Centric World Models for Robotics Manipulation
Understanding the world in terms of objects and the possible interactions with
them is an important cognitive ability, especially in robotics manipulation,
where many tasks require robot-object interactions. However, learning such a
structured world model, which specifically captures entities and relationships,
remains a challenging and underexplored problem. To address this, we propose
FOCUS, a model-based agent that learns an object-centric world model. Thanks to
a novel exploration bonus that stems from the object-centric representation,
FOCUS can be deployed on robotics manipulation tasks to explore object
interactions more easily. Evaluating our approach on manipulation tasks across
different settings, we show that object-centric world models allow the agent to
solve tasks more efficiently and enable consistent exploration of robot-object
interactions. Using a Franka Emika robot arm, we also showcase how FOCUS could
be adopted in real-world settings.
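The abstract does not specify how the object-centric exploration bonus is computed, so the snippet below is only a sketch of one plausible reading: per-object latents (hypothetical stand-ins for the world model's object-centric representation) and a bonus that rewards transitions which change those latents, i.e. robot-object interaction rather than free-space motion. Every name and layer here is an assumption for illustration.

# Hedged sketch of an object-centric exploration bonus in the spirit of FOCUS.
import torch
import torch.nn as nn

class ObjectCentricBonus(nn.Module):
    def __init__(self, obs_dim, n_objects, latent_dim=16):
        super().__init__()
        # One encoder per object slot, standing in for the world model's
        # per-object latent representation.
        self.encoders = nn.ModuleList(
            [nn.Linear(obs_dim, latent_dim) for _ in range(n_objects)])

    def object_latents(self, obs):
        # Shape: (batch, n_objects, latent_dim).
        return torch.stack([enc(obs) for enc in self.encoders], dim=-2)

    def bonus(self, obs, next_obs):
        # Reward transitions that move the object latents: interactions
        # that change objects score higher than motion that ignores them.
        z, z_next = self.object_latents(obs), self.object_latents(next_obs)
        return (z_next - z).pow(2).mean(dim=(-2, -1))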