Genetic-Gated Networks for Deep Reinforcement
We introduce Genetic-Gated Networks (G2Ns), simple neural networks that
incorporate a gate vector composed of binary genes in the hidden layer(s)
of the network. Our method takes advantage of both gradient-free and
gradient-based optimization: the former is effective for
problems with multiple local minima, while the latter can quickly find local
minima. In addition, because each chromosome defines a different model, it is
easy to construct multiple models, so the method can be applied effectively to
problems that require multiple models. We show that G2Ns can be applied to typical
reinforcement learning algorithms to achieve a large improvement in sample
efficiency and performance.
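The gating idea in the abstract above can be illustrated with a minimal sketch. It assumes a single hidden layer whose activations are multiplied elementwise by a binary chromosome, so that several chromosomes sharing one set of weights give several distinct models. The layer sizes, the helper names (`gated_forward`, `random_chromosome`, `mutate`), and the bit-flip mutation operator are illustrative assumptions, not details taken from the paper, and the gradient-based update of the shared weights is omitted.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x, b):
    # W is a list of rows; returns W @ x + b
    return [sum(w * xj for w, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def gated_forward(x, params, gate):
    """Forward pass with hidden activations masked by a binary gate vector."""
    W1, b1, W2, b2 = params
    hidden = relu(matvec(W1, x, b1))
    gated = [h * g for h, g in zip(hidden, gate)]  # a 0 gene switches a unit off
    return matvec(W2, gated, b2)

def random_chromosome(n_hidden, rng):
    # one binary gene per hidden unit
    return [rng.randint(0, 1) for _ in range(n_hidden)]

def mutate(gate, rate, rng):
    # assumed operator: flip each gene independently with probability `rate`
    return [1 - g if rng.random() < rate else g for g in gate]

rng = random.Random(0)
n_in, n_hidden, n_out = 3, 8, 2  # hypothetical sizes
params = (
    [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
    [0.0] * n_hidden,
    [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)],
    [0.0] * n_out,
)

# Four chromosomes over the same weights define four different models.
population = [random_chromosome(n_hidden, rng) for _ in range(4)]
x = [0.5, -0.2, 0.1]
outputs = [gated_forward(x, params, gate) for gate in population]
```

In a sketch like this, a gradient-free search would score each chromosome (for example by episode return) and evolve the population, while a gradient-based method would update the shared weights in `params`.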
Multi-Path Policy Optimization
Recent years have witnessed tremendous improvement in deep reinforcement
learning. However, a challenging problem is that an agent may suffer from
inefficient exploration, particularly in on-policy methods. Previous
exploration methods either rely on complex structures to estimate the novelty of
states or introduce sensitive hyperparameters that cause instability. We propose an
efficient exploration method, Multi-Path Policy Optimization (MPPO), which does
not incur high computation cost and ensures stability. MPPO maintains an
efficient mechanism that effectively utilizes a population of diverse policies
to enable better exploration, especially in sparse environments. We also give a
theoretical guarantee of stable performance. We build our scheme upon two
widely adopted on-policy methods, the Trust Region Policy Optimization
algorithm and the Proximal Policy Optimization algorithm. We conduct extensive
experiments on several MuJoCo tasks and their sparsified variants to fairly
evaluate the proposed method. Results show that MPPO significantly outperforms
state-of-the-art exploration methods in terms of both sample efficiency and
final performance.
Comment: AAMAS-202