8,983 research outputs found
Measuring Catastrophic Forgetting in Neural Networks
Deep neural networks are used in many state-of-the-art systems for machine
perception. Once a network is trained to do a specific task, e.g., bird
classification, it cannot easily be trained to do new tasks, e.g.,
incrementally learning to recognize additional bird species or learning an
entirely different task such as flower recognition. When new tasks are added,
typical deep neural networks are prone to catastrophically forgetting previous
tasks. Networks that are capable of assimilating new information incrementally,
much like how humans form new memories over time, will be more efficient than
re-training the model from scratch each time a new task needs to be learned.
There have been multiple attempts to develop schemes that mitigate catastrophic
forgetting, but these methods have not been directly compared, the tests used
to evaluate them vary considerably, and these methods have only been evaluated
on small-scale problems (e.g., MNIST). In this paper, we introduce new metrics
and benchmarks for directly comparing five different mechanisms designed to
mitigate catastrophic forgetting in neural networks: regularization,
ensembling, rehearsal, dual-memory, and sparse-coding. Our experiments on
real-world images and sounds show that the mechanism(s) that are critical for
optimal performance vary based on the incremental training paradigm and type of
data being used, but they all demonstrate that the catastrophic forgetting
problem has yet to be solved.Comment: To appear in AAAI 201
Born to learn: The inspiration, progress, and future of evolved plastic artificial neural networks
Biological plastic neural networks are systems of extraordinary computational
capabilities shaped by evolution, development, and lifetime learning. The
interplay of these elements leads to the emergence of adaptive behavior and
intelligence. Inspired by such intricate natural phenomena, Evolved Plastic
Artificial Neural Networks (EPANNs) use simulated evolution in-silico to breed
plastic neural networks with a large variety of dynamics, architectures, and
plasticity rules: these artificial systems are composed of inputs, outputs, and
plastic components that change in response to experiences in an environment.
These systems may autonomously discover novel adaptive algorithms, and lead to
hypotheses on the emergence of biological adaptation. EPANNs have seen
considerable progress over the last two decades. Current scientific and
technological advances in artificial neural networks are now setting the
conditions for radically new approaches and results. In particular, the
limitations of hand-designed networks could be overcome by more flexible and
innovative solutions. This paper brings together a variety of inspiring ideas
that define the field of EPANNs. The main methods and results are reviewed.
Finally, new opportunities and developments are presented
iCaRL: Incremental Classifier and Representation Learning
A major open problem on the road to artificial intelligence is the
development of incrementally learning systems that learn about more and more
concepts over time from a stream of data. In this work, we introduce a new
training strategy, iCaRL, that allows learning in such a class-incremental way:
only the training data for a small number of classes has to be present at the
same time and new classes can be added progressively. iCaRL learns strong
classifiers and a data representation simultaneously. This distinguishes it
from earlier works that were fundamentally limited to fixed data
representations and therefore incompatible with deep learning architectures. We
show by experiments on CIFAR-100 and ImageNet ILSVRC 2012 data that iCaRL can
learn many classes incrementally over a long period of time where other
strategies quickly fail.Comment: Accepted paper at CVPR 201
POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
Most of computer science focuses on automatically solving given computational
problems. I focus on automatically inventing or discovering problems in a way
inspired by the playful behavior of animals and humans, to train a more and
more general problem solver from scratch in an unsupervised fashion. Consider
the infinite set of all computable descriptions of tasks with possibly
computable solutions. The novel algorithmic framework POWERPLAY (2011)
continually searches the space of possible pairs of new tasks and modifications
of the current problem solver, until it finds a more powerful problem solver
that provably solves all previously learned tasks plus the new one, while the
unmodified predecessor does not. Wow-effects are achieved by continually making
previously learned skills more efficient such that they require less time and
space. New skills may (partially) re-use previously learned skills. POWERPLAY's
search orders candidate pairs of tasks and solver modifications by their
conditional computational (time & space) complexity, given the stored
experience so far. The new task and its corresponding task-solving skill are
those first found and validated. The computational costs of validating new
tasks need not grow with task repertoire size. POWERPLAY's ongoing search for
novelty keeps breaking the generalization abilities of its present solver. This
is related to Goedel's sequence of increasingly powerful formal theories based
on adding formerly unprovable statements to the axioms without affecting
previously provable theorems. The continually increasing repertoire of problem
solving procedures can be exploited by a parallel search for solutions to
additional externally posed tasks. POWERPLAY may be viewed as a greedy but
practical implementation of basic principles of creativity. A first
experimental analysis can be found in separate papers [53,54].Comment: 21 pages, additional connections to previous work, references to
first experiments with POWERPLA
Lifelong Learning of Spatiotemporal Representations with Dual-Memory Recurrent Self-Organization
Artificial autonomous agents and robots interacting in complex environments
are required to continually acquire and fine-tune knowledge over sustained
periods of time. The ability to learn from continuous streams of information is
referred to as lifelong learning and represents a long-standing challenge for
neural network models due to catastrophic forgetting. Computational models of
lifelong learning typically alleviate catastrophic forgetting in experimental
scenarios with given datasets of static images and limited complexity, thereby
differing significantly from the conditions artificial agents are exposed to.
In more natural settings, sequential information may become progressively
available over time and access to previous experience may be restricted. In
this paper, we propose a dual-memory self-organizing architecture for lifelong
learning scenarios. The architecture comprises two growing recurrent networks
with the complementary tasks of learning object instances (episodic memory) and
categories (semantic memory). Both growing networks can expand in response to
novel sensory experience: the episodic memory learns fine-grained
spatiotemporal representations of object instances in an unsupervised fashion
while the semantic memory uses task-relevant signals to regulate structural
plasticity levels and develop more compact representations from episodic
experience. For the consolidation of knowledge in the absence of external
sensory input, the episodic memory periodically replays trajectories of neural
reactivations. We evaluate the proposed model on the CORe50 benchmark dataset
for continuous object recognition, showing that we significantly outperform
current methods of lifelong learning in three different incremental learning
scenario
- …