Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search
In principle, reinforcement learning and policy search methods can enable
robots to learn highly complex and general skills that may allow them to
function amid the complexity and diversity of the real world. However, training
a policy that generalizes well across a wide range of real-world conditions
requires far greater quantity and diversity of experience than is practical to
collect with a single robot. Fortunately, it is possible for multiple robots to
share their experience with one another and thereby learn a policy
collectively. In this work, we explore distributed and asynchronous policy
learning as a means to achieve generalization and improved training times on
challenging, real-world manipulation tasks. We propose a distributed and
asynchronous version of Guided Policy Search and use it to demonstrate
collective policy learning on a vision-based door opening task using four
robots. We show that it achieves better generalization, utilization, and
training times than the single robot alternative.
Comment: Submitted to the IEEE International Conference on Robotics and
Automation 201
Bayesian policy selection using active inference
Learning to take actions based on observations is a core requirement for
artificial agents to be successful and robust at their tasks.
Reinforcement Learning (RL) is a well-known technique for learning such
policies. However, current RL algorithms often require reward shaping, have
difficulty generalizing to other environments, and are often
sample-inefficient. In this paper, we explore active inference and the
free energy principle, a normative theory from neuroscience that explains how
self-organizing biological systems operate by maintaining a model of the world
and casting action selection as an inference problem. We apply this concept to
a typical problem known to the RL community, the mountain car problem, and show
how active inference encompasses both RL and learning from demonstrations.
Comment: ICLR 2019 Workshop on Structure & priors in reinforcement learnin