2 research outputs found
Recurrent Control Nets for Deep Reinforcement Learning
Central Pattern Generators (CPGs) are biological neural circuits capable of
producing coordinated rhythmic outputs in the absence of rhythmic input. As a
result, they are responsible for most rhythmic motion in living organisms. This
rhythmic control is broadly applicable to fields such as locomotive robotics
and medical devices. In this paper, we explore the possibility of creating a
self-sustaining CPG network for reinforcement learning that learns rhythmic
motion more efficiently and across more general environments than the current
multilayer perceptron (MLP) baseline models. Recent work introduces the
Structured Control Net (SCN), which maintains linear and nonlinear modules for
local and global control, respectively. Here, we show that time-sequence
architectures such as Recurrent Neural Networks (RNNs) model CPGs effectively.
Combining previous work with RNNs and SCNs, we introduce the Recurrent Control
Net (RCN), which adds a linear component to the [...]. RCNs match and exceed the
performance of baseline MLPs and SCNs across all environment tasks. Our
findings confirm existing intuitions for RNNs on reinforcement learning tasks,
and demonstrate the promise of SCN-like structures in reinforcement learning.
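The idea of pairing a linear module with a nonlinear recurrent module can be sketched roughly as follows. This is only an illustrative sketch, not the authors' architecture: the class name RecurrentControlSketch, the Elman-style tanh cell, the hidden size, the random initialization, and the choice to sum the two module outputs are all assumptions made for the example.

```python
import numpy as np


class RecurrentControlSketch:
    """Hypothetical policy: action = linear(obs) + recurrent_nonlinear(obs, hidden)."""

    def __init__(self, obs_dim, act_dim, hidden_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        # Linear module: maps the observation directly to an action.
        self.K = rng.normal(scale=0.1, size=(act_dim, obs_dim))
        # Nonlinear recurrent module (assumed Elman-style RNN cell).
        self.W_in = rng.normal(scale=0.1, size=(hidden_dim, obs_dim))
        self.W_rec = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
        self.W_out = rng.normal(scale=0.1, size=(act_dim, hidden_dim))
        self.h = np.zeros(hidden_dim)

    def reset(self):
        # Clear the hidden state at the start of an episode.
        self.h = np.zeros_like(self.h)

    def act(self, obs):
        obs = np.asarray(obs, dtype=float)
        # Update the hidden state from the current observation and the previous state.
        self.h = np.tanh(self.W_in @ obs + self.W_rec @ self.h)
        # Sum the linear term and the recurrent term (the summation is an assumption).
        return self.K @ obs + self.W_out @ self.h


policy = RecurrentControlSketch(obs_dim=3, act_dim=2)
first = policy.act(np.ones(3))
second = policy.act(np.ones(3))  # same observation, different hidden state
```

Because the hidden state carries over between calls, the same observation can yield different actions on consecutive steps, which is what lets a recurrent policy produce rhythmic output without rhythmic input.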
A Quadratic Actor Network for Model-Free Reinforcement Learning
In this work we discuss the incorporation of quadratic neurons into policy
networks in the context of model-free actor-critic reinforcement learning.
Quadratic neurons admit an explicit quadratic function approximation in
contrast to conventional approaches where the non-linearity is induced by
the activation functions. We perform empirical experiments on several MuJoCo
continuous control tasks and find that MLP policy networks with added quadratic
neurons outperform the baseline MLP whilst using a smaller number of
parameters. The top returned reward is on average increased by [...] while
being about [...] more sample efficient. Moreover, the quadratic network can
maintain its advantage against added action and observation noise.
Comment: 8 pages, 15 figures
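To make the contrast with activation-induced non-linearity concrete, a single quadratic neuron can be sketched as below. This is a generic form, x^T A x + w^T x + b, assumed for illustration; the abstract does not specify the exact parameterization used in the paper, and the function name and example values are hypothetical.

```python
import numpy as np


def quadratic_neuron(x, A, w, b):
    """Return x^T A x + w^T x + b for one input vector x (assumed generic form)."""
    x = np.asarray(x, dtype=float)
    # The quadratic term supplies the non-linearity directly; no activation function is applied.
    return float(x @ A @ x + w @ x + b)


# Hypothetical example values.
x = np.array([1.0, 2.0])
A = np.array([[1.0, 0.0], [0.0, 0.5]])
w = np.array([0.5, -1.0])
b = 0.1
y = quadratic_neuron(x, A, w, b)
```

Setting A to zero reduces the neuron to an ordinary affine unit, so the quadratic term is the only source of non-linearity in this sketch.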