Progressive Learning without Forgetting
Learning from changing tasks and sequential experience without forgetting
previously acquired knowledge is a challenging problem for artificial neural
networks. In this work, we focus on two challenging problems in the paradigm of
Continual Learning (CL) that arise when no old data is available: (i) the
accumulation of catastrophic forgetting caused by the gradually fading
knowledge space from which the model learned previous tasks; (ii) the
uncontrolled tug-of-war dynamics of balancing stability and plasticity while
learning new tasks. To tackle these problems, we present Progressive Learning
without Forgetting (PLwF) and a credit-assignment regime in the optimizer. PLwF
densely introduces model functions from previous tasks to construct a knowledge
space that contains the most reliable knowledge of each task as well as the
distribution information of different tasks, while credit assignment controls
the tug-of-war dynamics by removing gradient conflict through projection.
Extensive ablation experiments demonstrate the effectiveness of PLwF and credit
assignment. In comparison with other CL methods, we report notably better
results even without relying on any raw data.
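The abstract does not spell out the projection rule, only that credit assignment removes gradient conflict through projection. A minimal sketch of one common form of such a rule (a PCGrad-style projection; the function name and update are illustrative assumptions, not the paper's exact method): when the new-task gradient points against the old-task gradient, the component along the old-task direction is removed before the optimizer step.

```python
import numpy as np

def project_conflicting(g_new, g_old):
    """If the new-task gradient conflicts with the old-task gradient
    (negative dot product), project out the conflicting component so
    the update no longer pulls against the old task."""
    dot = np.dot(g_new, g_old)
    if dot < 0:
        g_new = g_new - (dot / np.dot(g_old, g_old)) * g_old
    return g_new

# Conflicting case: the projected update is orthogonal to g_old.
g_old = np.array([1.0, 0.0])
g_new = np.array([-1.0, 1.0])
print(project_conflicting(g_new, g_old))  # -> [0. 1.]
```

Non-conflicting gradients (non-negative dot product) pass through unchanged, so the tug-of-war only gets resolved when the two objectives actually pull in opposite directions.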
Incremental Learning of Object Detectors without Catastrophic Forgetting
Despite their success for object detection, convolutional neural networks are
ill-equipped for incremental learning, i.e., adapting the original model
trained on a set of classes to additionally detect objects of new classes, in
the absence of the initial training data. They suffer from "catastrophic
forgetting" - an abrupt degradation of performance on the original set of
classes, when the training objective is adapted to the new classes. We present
a method to address this issue, and learn object detectors incrementally, when
neither the original training data nor annotations for the original classes in
the new training set are available. The core of our proposed solution is a loss
function to balance the interplay between predictions on the new classes and a
new distillation loss which minimizes the discrepancy between responses for old
classes from the original and the updated networks. This incremental learning
can be performed multiple times, for a new set of classes in each step, with a
moderate drop in performance compared to the baseline network trained on the
ensemble of data. We present object detection results on the PASCAL VOC 2007
and COCO datasets, along with a detailed empirical analysis of the approach.
Comment: To appear in ICCV 2017
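The distillation term described above keeps the updated network's responses on old classes close to those of the frozen original network. A minimal NumPy sketch of one common form of such a loss, temperature-softened cross-entropy between the two networks' outputs (the exact loss form in the paper may differ; this is an illustrative stand-in):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Numerically stable softmax with temperature T (rows = examples)."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(old_logits, new_logits, T=2.0):
    """Cross-entropy between the frozen original network's softened
    responses (targets) and the updated network's responses on the
    old classes; minimized when the two match."""
    p_old = softmax(old_logits, T)   # targets from the frozen network
    p_new = softmax(new_logits, T)   # responses of the updated network
    return -np.mean(np.sum(p_old * np.log(p_new + 1e-12), axis=1))
```

During incremental training this term would be added to the usual detection objective on the new classes, so the network is pulled toward the new annotations while being anchored to its old-class behavior.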
Goldilocks Forgetting in Cross-Situational Learning
Given that there is referential uncertainty (noise) when learning words, to what extent can forgetting filter some of that noise out and be an aid to learning? Using a cross-situational learning model, we find a U-shaped function of errors indicative of a "Goldilocks" zone of forgetting: an optimal store-loss ratio that is neither too aggressive nor too weak, but just the right amount to produce better learning outcomes. Forgetting acts as a high-pass filter that actively deletes (part of) the referential-ambiguity noise, retains intended referents, and effectively amplifies the signal. The model achieves this performance without incorporating any specific cognitive biases of the type proposed in the constraints-and-principles account, and without any prescribed developmental changes in the underlying learning mechanism. Instead, we interpret the model's performance as more of a by-product of exposure to input, where the associative strengths in the lexicon grow as a function of linguistic experience in combination with memory limitations. The result adds a mechanistic explanation for the experimental evidence on spaced learning and, more generally, advocates integrating domain-general aspects of cognition, such as memory, into the language acquisition process.
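The store-loss mechanism can be illustrated with a toy cross-situational learner (entirely illustrative; the parameters, update rule, and scoring here are assumptions, not the paper's model): on each exposure a word co-occurs with its intended referent plus random distractors, associations are incremented on co-occurrence, and every step all associations decay by a fixed factor, which is the "loss" side of the store-loss ratio.

```python
import random

def simulate(decay, n_words=20, n_exposures=200, noise=3, seed=0):
    """Toy cross-situational learner. Each exposure pairs a word with its
    intended referent plus `noise` random distractor referents; co-occurring
    associations gain +1 and all associations are multiplied by `decay`
    each step (decay=1.0 means no forgetting). Returns comprehension
    accuracy: the fraction of words whose strongest association is the
    intended referent."""
    rng = random.Random(seed)
    strength = [[0.0] * n_words for _ in range(n_words)]  # word x referent
    for _ in range(n_exposures):
        w = rng.randrange(n_words)
        scene = {w} | {rng.randrange(n_words) for _ in range(noise)}
        for row in strength:                 # forgetting: global decay
            for j in range(n_words):
                row[j] *= decay
        for r in scene:                      # storage: co-occurrence boost
            strength[w][r] += 1.0
    correct = sum(1 for w in range(n_words)
                  if max(range(n_words), key=lambda r: strength[w][r]) == w)
    return correct / n_words

# Compare no forgetting with moderate forgetting on the same input stream.
print(simulate(decay=1.0), simulate(decay=0.95))
```

Sweeping `decay` over a grid in such a model is how one would look for the U-shaped error curve the abstract reports: too little decay keeps distractor noise, too much wipes out the intended associations along with it.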
Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting
In lifelong learning systems, especially those based on artificial neural
networks, one of the biggest obstacles is the severe inability to retain old
knowledge as new information is encountered. This phenomenon is known as
catastrophic forgetting. In this article, we propose a new kind of
connectionist architecture, the Sequential Neural Coding Network, that is
robust to forgetting when learning from streams of data points and, unlike
networks of today, does not learn via the immensely popular back-propagation of
errors. Grounded in the neurocognitive theory of predictive processing, our
model adapts its synapses in a biologically-plausible fashion, while another,
complementary neural system rapidly learns to direct and control this
cortex-like structure by mimicking the task-executive control functionality of
the basal ganglia. In our experiments, we demonstrate that our self-organizing
system experiences significantly less forgetting as compared to standard neural
models and outperforms a wide swath of previously proposed methods even though
it is trained across task datasets in a stream-like fashion. The promising
performance of our complementary system on benchmarks, e.g., SplitMNIST, Split
Fashion MNIST, and Split NotMNIST, offers evidence that by incorporating
mechanisms prominent in real neuronal systems, such as competition, sparse
activation patterns, and iterative input processing, a new possibility for
tackling the grand challenge of lifelong machine learning opens up.
Comment: Key updates including results on standard benchmarks, e.g., Split
MNIST, Split Fashion MNIST, and Split NotMNIST. The task-selection/basal
ganglia model has been integrated.
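The local, backpropagation-free learning the abstract describes can be sketched for a single layer (a generic predictive-coding update, not the paper's full Sequential Neural Coding Network): the layer predicts its input from its state activity, and its weights change using only the locally available prediction error, with no backward pass through the rest of the network.

```python
import numpy as np

def pc_layer_step(W, z, x, lr=0.1):
    """One local predictive-coding update: the layer predicts its input
    as W @ z, measures the prediction error, and adjusts W with a
    Hebbian-like rule that uses only locally available signals
    (no backpropagated global error)."""
    e = x - W @ z                   # local prediction error
    W = W + lr * np.outer(e, z)     # error-driven, local weight change
    return W, e

# Repeatedly updating on a fixed (input, state) pair drives the
# prediction error toward zero without any backward pass.
z = np.array([1.0, -0.5, 0.5])               # state/latent activity
x = np.array([0.2, -0.4, 0.9, 0.0, -0.3])    # input to be predicted
W = np.zeros((5, 3))
for _ in range(200):
    W, e = pc_layer_step(W, z, x)
print(np.linalg.norm(e))  # error shrinks toward zero
```

Stacking such layers, with each one predicting the activity of the layer below, is the usual predictive-processing arrangement; every weight update remains local to its layer, which is what distinguishes this family of models from error backpropagation.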