14 research outputs found
Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks
A long-term goal of AI is to produce agents that can learn a diversity of
skills throughout their lifetimes and continuously improve those skills via
experience. A longstanding obstacle to that goal is catastrophic
forgetting, in which learning new information erases previously learned
information. Catastrophic forgetting occurs in artificial neural networks
(ANNs), which have fueled most recent advances in AI. A recent paper proposed
that catastrophic forgetting in ANNs can be reduced by promoting modularity,
which can limit forgetting by isolating task information to specific clusters
of nodes and connections (functional modules). While the prior work did show
that modular ANNs suffered less from catastrophic forgetting, it was not able
to produce ANNs that possessed task-specific functional modules, thereby
leaving the main theory regarding modularity and forgetting untested. We
introduce diffusion-based neuromodulation, which simulates the release of
diffusing, neuromodulatory chemicals within an ANN that can modulate (i.e., up-
or down-regulate) learning in a spatial region. On the simple diagnostic
problem from the prior work, diffusion-based neuromodulation 1) induces
task-specific learning in groups of nodes and connections (task-specific
localized learning), which 2) produces functional modules for each subtask, and
3) yields higher performance by eliminating catastrophic forgetting. Overall,
our results suggest that diffusion-based neuromodulation promotes task-specific
localized learning and functional modularity, which can help solve the
challenging but important problem of catastrophic forgetting.
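
As a rough illustration of the idea of spatially modulated learning, the sketch below scales a connection's weight update by the concentration of a simulated chemical at the receiving node. Everything in it is an assumption made for illustration: the linear falloff, the `radius` and `base_lr` parameters, and the function names are hypothetical and are not taken from the paper.

```python
import math

def concentration(node_pos, source_pos, radius=1.5):
    """Hypothetical linear falloff: 1.0 at the source, 0.0 at `radius` and beyond."""
    distance = math.dist(node_pos, source_pos)
    return max(0.0, 1.0 - distance / radius)

def modulated_update(weight, delta, node_pos, source_pos, base_lr=0.1):
    """Apply a weight change scaled by the local chemical concentration.

    Nodes far from the source receive a concentration of zero, so their
    incoming weights are left unchanged (learning is switched off there).
    """
    gate = concentration(node_pos, source_pos)
    return weight + base_lr * gate * delta

# A node at the source learns at the full rate; a distant node does not learn.
near = modulated_update(0.5, 1.0, node_pos=(0.0, 0.0), source_pos=(0.0, 0.0))
far = modulated_update(0.5, 1.0, node_pos=(3.0, 0.0), source_pos=(0.0, 0.0))
```

If each subtask had its own source location, updates for that subtask would be confined to nearby nodes, which is one way the localized learning described above could arise.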
Combating catastrophic forgetting with developmental compression
Generally intelligent agents exhibit successful behavior across problems in
several settings. Endemic to approaches that aim to realize such intelligence in
machines is catastrophic forgetting: sequential learning corrupts knowledge
obtained earlier in the sequence, or tasks antagonistically compete for system
resources. Methods for obviating catastrophic forgetting have sought to
identify and preserve features of the system necessary to solve one problem
when learning to solve another, or to enforce modularity such that minimally
overlapping sub-functions contain task-specific knowledge. While successful,
both approaches scale poorly because they require larger architectures as the
number of training instances grows, causing different parts of the system to
specialize for separate subsets of the data. Here we present a method for
addressing catastrophic forgetting called developmental compression. It
exploits the mild impacts of developmental mutations to lessen adverse changes
to previously evolved capabilities and 'compresses' specialized neural networks
into a generalized one. In the absence of domain knowledge, developmental
compression produces systems that avoid overt specialization, alleviating the
need to engineer a bespoke system for every task permutation and suggesting
better scalability than existing approaches. We validate this method on a robot
control problem and hope to extend this approach to other machine learning
domains in the future.
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
Evolution strategies (ES) are a family of black-box optimization algorithms
able to train deep neural networks roughly as well as Q-learning and policy
gradient methods on challenging deep reinforcement learning (RL) problems, but
are much faster (e.g. hours vs. days) because they parallelize better. However,
many RL problems require directed exploration because they have reward
functions that are sparse or deceptive (i.e. contain local optima), and it is
unknown how to encourage such exploration with ES. Here we show that algorithms
that have been invented to promote directed exploration in small-scale evolved
neural networks via populations of exploring agents, specifically novelty
search (NS) and quality diversity (QD) algorithms, can be hybridized with ES to
improve its performance on sparse or deceptive deep RL tasks, while retaining
scalability. Our experiments confirm that the resultant new algorithms, NS-ES
and two QD algorithms, NSR-ES and NSRA-ES, avoid local optima encountered by ES
to achieve higher performance on Atari and simulated robots learning to walk
around a deceptive trap. This paper thus introduces a family of fast, scalable
algorithms for reinforcement learning that are capable of directed exploration.
It also adds this new family of exploration algorithms to the RL toolbox and
raises the interesting possibility that analogous algorithms with multiple
simultaneous paths of exploration might also combine well with existing RL
algorithms outside ES.
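
A minimal sketch of how a novelty signal might be computed and mixed with reward is given below. It assumes that novelty is the mean distance from an agent's behavior descriptor to its k nearest neighbors in an archive of past behaviors, and that a hybrid score is a weighted sum of reward and novelty. The function names, the two-dimensional behavior descriptors, and the weight `w` are hypothetical choices for illustration, not the paper's implementation.

```python
import math

def novelty(behavior, archive, k=2):
    """Mean distance from `behavior` to its k nearest neighbors in `archive`."""
    if not archive:
        return 0.0
    distances = sorted(math.dist(behavior, other) for other in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)

def blended_score(reward, nov, w=0.5):
    """Weighted mix of reward and novelty; w=1.0 uses reward only, w=0.0 novelty only."""
    return w * reward + (1.0 - w) * nov

# Hypothetical archive of final (x, y) positions reached by earlier agents.
archive = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
nov = novelty((0.0, 0.0), archive, k=2)
score = blended_score(reward=1.0, nov=nov, w=0.5)
```

In this sketch, a score like this would stand in for raw reward when ranking perturbations in an ES update, so that agents reaching unfamiliar behaviors are also favored.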
Guiding Neuroevolution with Structural Objectives
The structure and performance of neural networks are intimately connected,
and evolutionary algorithms can be used to explore neural network structures
that are well adapted to a given task. Guiding such neuroevolution with
additional objectives related to network structure has been shown to improve
performance in some cases, especially when modular neural networks are
beneficial. However, apart from objectives aiming to make networks more
modular, such structural objectives have not been widely explored. We propose
two new structural objectives and test their ability to guide evolving neural
networks on two problems which can benefit from decomposition into subtasks.
The first structural objective guides evolution to align neural networks with a
user-recommended decomposition pattern. Intuitively, this should be a powerful
guiding target for problems where human users can easily identify a structure.
The second structural objective guides evolution towards a population with a
high diversity in decomposition patterns. This results in exploration of many
different ways to decompose a problem, allowing evolution to find good
decompositions faster. Tests on our target problems reveal that both methods
perform well on a problem with a very clear and decomposable structure.
However, on a problem where the optimal decomposition is less obvious, the
structural diversity objective is found to outcompete other structural
objectives, and this technique can even increase performance on problems
without any decomposable structure at all.
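
One way a diversity objective over decomposition patterns could be scored is sketched below. It assumes each network's decomposition is summarized as a list assigning every node to a module, compares two patterns by the fraction of node pairs on which they disagree about co-membership, and rewards an individual for its mean distance to the rest of the population. This representation and all names are assumptions for illustration; the abstract does not specify how patterns are encoded or compared.

```python
from itertools import combinations

def co_membership(assignment):
    """For every node pair, record whether both nodes share a module."""
    pairs = combinations(range(len(assignment)), 2)
    return [assignment[i] == assignment[j] for i, j in pairs]

def pattern_distance(a, b):
    """Fraction of node pairs grouped differently in patterns a and b.

    Comparing co-membership makes the distance independent of module labels.
    Assumes both patterns cover the same nodes and have at least two nodes.
    """
    pa, pb = co_membership(a), co_membership(b)
    return sum(x != y for x, y in zip(pa, pb)) / len(pa)

def diversity_score(index, population):
    """Mean pattern distance from one individual to all others in the population."""
    others = [p for i, p in enumerate(population) if i != index]
    return sum(pattern_distance(population[index], o) for o in others) / len(others)

# Hypothetical module assignments for three four-node networks.
population = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
scores = [diversity_score(i, population) for i in range(len(population))]
```

Here the first two individuals encode the same grouping under different labels, so the third, which groups the nodes differently, receives the highest diversity score.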