Automatic Curriculum Learning For Deep RL: A Short Survey
Automatic Curriculum Learning (ACL) has become a cornerstone of recent
successes in Deep Reinforcement Learning (DRL). These methods shape the learning
trajectories of agents by challenging them with tasks adapted to their
capacities. In recent years, they have been used to improve sample efficiency
and asymptotic performance, to organize exploration, to encourage
generalization or to solve sparse reward problems, among others. The ambition
of this work is dual: 1) to present a compact and accessible introduction to
the Automatic Curriculum Learning literature and 2) to draw a bigger picture of
the current state of the art in ACL to encourage the cross-breeding of existing
concepts and the emergence of new ideas.
Comment: Accepted at IJCAI202
VPE: Variational Policy Embedding for Transfer Reinforcement Learning
Reinforcement Learning methods are capable of solving complex problems, but
resulting policies might perform poorly in environments that are even slightly
different. In robotics especially, training and deployment conditions often
vary and data collection is expensive, making retraining undesirable.
Simulation training allows for feasible training times, but on the other hand
suffers from a reality-gap when applied in real-world settings. This raises the
need for efficient adaptation of policies acting in new environments. We
consider this as a problem of transferring knowledge within a family of similar
Markov decision processes.
For this purpose we assume that Q-functions are generated by some
low-dimensional latent variable. Given such a Q-function, we can find a master
policy that can adapt given different values of this latent variable. Our
method learns both the generative mapping and an approximate posterior of the
latent variables, enabling identification of policies for new tasks by
searching only in the latent space, rather than the space of all policies. The
low-dimensional space and master policy found by our method enable policies
to quickly adapt to new environments. We demonstrate the method both on a
pendulum swing-up task in simulation and for simulation-to-real transfer on a
pushing task.
Artificial Intelligence and Systems Theory: Applied to Cooperative Robots
This paper describes an approach to the design of a population of cooperative
robots based on concepts borrowed from Systems Theory and Artificial
Intelligence. The research has been developed under the SocRob project, carried
out by the Intelligent Systems Laboratory at the Institute for Systems and
Robotics - Instituto Superior Tecnico (ISR/IST) in Lisbon. The acronym of the
project stands both for "Society of Robots" and "Soccer Robots", the case study
where we are testing our population of robots. Designing soccer robots is a
very challenging problem, where the robots must act not only to shoot a ball
towards the goal, but also to detect and avoid static (walls, stopped robots)
and dynamic (moving robots) obstacles. Furthermore, they must cooperate to
defeat an opposing team. Our past and current research in soccer robotics
includes cooperative sensor fusion for world modeling, object recognition and
tracking, robot navigation, multi-robot distributed task planning and
coordination, including cooperative reinforcement learning in cooperative and
adversarial environments, and behavior-based architectures for real-time task
execution of cooperating robot teams.
InfoSwarms: Drone Swarms and Information Warfare
Drone swarms, which can be used at sea, on land, in the air, and even in space, are fundamentally information-dependent weapons. No study to date has examined drone swarms in the context of information warfare writ large. This article explores the dependence of these swarms on information and the resultant connections with areas of information warfare (electronic, cyber, space, and psychological), drawing on open-source research and qualitative reasoning. Overall, the article offers insights into how this important emerging technology fits into the broader defense ecosystem and outlines practical approaches to strengthening related information warfare capabilities.
Joint Goal and Strategy Inference across Heterogeneous Demonstrators via Reward Network Distillation
Reinforcement learning (RL) has achieved tremendous success as a general
framework for learning how to make decisions. However, this success relies on
the interactive hand-tuning of a reward function by RL experts. On the other
hand, inverse reinforcement learning (IRL) seeks to learn a reward function
from readily-obtained human demonstrations. Yet, IRL suffers from two major
limitations: 1) reward ambiguity - there are an infinite number of possible
reward functions that could explain an expert's demonstration and 2)
heterogeneity - human experts adopt varying strategies and preferences, which
makes learning from multiple demonstrators difficult due to the common
assumption that demonstrators seek to maximize the same reward. In this work,
we propose a method to jointly infer a task goal and humans' strategic
preferences via network distillation. This approach enables us to distill a
robust task reward (addressing reward ambiguity) and to model each strategy's
objective (handling heterogeneity). We demonstrate our algorithm can better
recover task reward and strategy rewards and imitate the strategies in two
simulated tasks and a real-world table tennis task.
Comment: In Proceedings of the 2020 ACM/IEEE International Conference on
Human-Robot Interaction (HRI '20), March 23 to 26, 2020, Cambridge, United
Kingdom. ACM, New York, NY, USA, 10 pages