Measuring Catastrophic Forgetting in Neural Networks
Deep neural networks are used in many state-of-the-art systems for machine
perception. Once a network is trained to do a specific task, e.g., bird
classification, it cannot easily be trained to do new tasks, e.g.,
incrementally learning to recognize additional bird species or learning an
entirely different task such as flower recognition. When new tasks are added,
typical deep neural networks are prone to catastrophically forgetting previous
tasks. Networks that are capable of assimilating new information incrementally,
much like how humans form new memories over time, will be more efficient than
re-training the model from scratch each time a new task needs to be learned.
There have been multiple attempts to develop schemes that mitigate catastrophic
forgetting, but these methods have not been directly compared, the tests used
to evaluate them vary considerably, and these methods have only been evaluated
on small-scale problems (e.g., MNIST). In this paper, we introduce new metrics
and benchmarks for directly comparing five different mechanisms designed to
mitigate catastrophic forgetting in neural networks: regularization,
ensembling, rehearsal, dual-memory, and sparse-coding. Our experiments on
real-world images and sounds show that the mechanism(s) that are critical for
optimal performance vary based on the incremental training paradigm and type of
data being used, but they all demonstrate that the catastrophic forgetting
problem has yet to be solved.
Comment: To appear in AAAI 2018.
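The abstract above mentions new metrics for comparing mitigation mechanisms but does not spell them out here. As a hedged illustration of how forgetting is commonly quantified in incremental-training benchmarks, the sketch below (plain NumPy, with hypothetical accuracy values) tracks per-task test accuracy after each training session and reports both the final average accuracy and the mean drop of each earlier task from its best previously observed accuracy.

```python
import numpy as np

def forgetting_summary(acc_matrix):
    """Summarize catastrophic forgetting from an accuracy matrix.

    acc_matrix[t][k] = test accuracy on task k measured after training
    session t (tasks are learned sequentially, so only k <= t is meaningful).
    Returns average accuracy after the last session and the mean drop of
    each earlier task from its best previously observed accuracy.
    """
    acc = np.asarray(acc_matrix, dtype=float)
    final_acc = acc[-1, :].mean()

    drops = []
    for k in range(acc.shape[1] - 1):          # every task except the newest
        best_earlier = acc[k:-1, k].max()      # best accuracy before the last session
        drops.append(best_earlier - acc[-1, k])
    mean_forgetting = float(np.mean(drops)) if drops else 0.0
    return final_acc, mean_forgetting

# Hypothetical run with 3 sequential tasks (rows: after session 1..3).
accs = [[0.95, 0.00, 0.00],
        [0.60, 0.93, 0.00],
        [0.40, 0.55, 0.94]]
print(forgetting_summary(accs))   # large drops on older tasks signal forgetting
```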
Evaluation of Regularization-based Continual Learning Approaches: Application to HAR
Pervasive computing allows the provision of services in many important areas,
including the relevant and dynamic field of health and well-being. In this
domain, Human Activity Recognition (HAR) has gained a lot of attention in
recent years. Current solutions rely on Machine Learning (ML) models and
achieve impressive results. However, evolving these models remains difficult
unless complete retraining is performed. To overcome this problem, Continual
Learning is a promising direction, in particular regularization-based
techniques, which are attractive for their simplicity and low cost. Initial
studies have been conducted and have shown promising outcomes. However, they
remain very specific and difficult to compare. In this paper, we provide a
comprehensive comparison of three regularization-based methods that we adapted
to the HAR domain, highlighting their strengths and limitations. Our
experiments were conducted on the UCI HAR dataset and the results showed that
no single technique outperformed all others across all scenarios considered.
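The abstract does not name the three regularization-based methods it adapts to HAR. As one hedged, representative example of this family, the sketch below shows an Elastic Weight Consolidation (EWC)-style quadratic penalty in PyTorch that discourages parameters important for earlier tasks from drifting; the `fisher` and `old_params` dictionaries and the weighting `lam` are hypothetical inputs assumed to have been saved after training on previous tasks.

```python
import torch

def ewc_penalty(model, fisher, old_params, lam=100.0):
    """EWC-style regularization term: penalize moving parameters that were
    important for previous tasks, with importance given by a diagonal Fisher
    information estimate computed after the earlier training phase.
    `fisher` and `old_params` are dicts keyed by parameter name."""
    penalty = 0.0
    for name, param in model.named_parameters():
        if name in fisher:
            penalty = penalty + (fisher[name] * (param - old_params[name]) ** 2).sum()
    return lam * penalty

# Hypothetical training step on a new HAR task:
#   loss = task_loss(model(batch_x), batch_y) + ewc_penalty(model, fisher, old_params)
#   loss.backward(); optimizer.step()
```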
Explaining How Deep Neural Networks Forget by Deep Visualization
Explaining the behavior of deep neural networks, usually considered black
boxes, is critical, especially now that they are being adopted across diverse
aspects of human life. Taking advantage of interpretable machine learning
(interpretable ML), this paper proposes a novel tool called Catastrophic
Forgetting Dissector (or CFD) to explain catastrophic forgetting in continual
learning settings. We also introduce a new method called Critical Freezing
based on the observations of our tool. Experiments on ResNet show how
catastrophic forgetting happens, in particular which components of this
well-known network forget. Our new continual learning algorithm outperforms
various recent techniques by a significant margin, demonstrating the value of
the investigation. Critical Freezing not only combats catastrophic forgetting
but also provides explainability.
Comment: 12 pages, 4 figures, 1 table. arXiv admin note: substantial text overlap with arXiv:2001.0157
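The abstract describes Critical Freezing only at a high level. Below is a minimal sketch of the freezing idea, assuming a hypothetical list of layer-name prefixes flagged as critical by an interpretability tool such as CFD; it uses PyTorch/torchvision and is not the authors' implementation.

```python
import torch
from torchvision import models

def freeze_critical_blocks(model, critical_prefixes):
    """Freeze the blocks flagged as critical for old tasks so their weights
    cannot drift while the remaining layers adapt to the new task."""
    for name, param in model.named_parameters():
        if any(name.startswith(p) for p in critical_prefixes):
            param.requires_grad = False
    return model

model = models.resnet18(weights=None)
# Assume the early stages were identified as critical (illustrative choice only).
freeze_critical_blocks(model, ["conv1", "bn1", "layer1", "layer2"])
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)   # train only the unfrozen layers
```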
FastICARL: Fast incremental classifier and representation learning with efficient budget allocation in audio sensing applications
Various incremental learning (IL) approaches have been proposed to help deep
learning models learn new tasks/classes continuously without forgetting what
was learned previously (i.e., avoid catastrophic forgetting). With the growing
number of deployed audio sensing applications that need to dynamically
incorporate new tasks and changing input distribution from users, the ability
of IL on-device becomes essential for both efficiency and user privacy.
However, prior works suffer from high computational costs and storage demands,
which hinder the deployment of IL on-device. In this work, to overcome these
limitations, we develop an end-to-end and on-device IL framework, FastICARL,
that incorporates an exemplar-based IL and quantization in the context of
audio-based applications. We first employ k-nearest-neighbor to reduce the
latency of IL. Then, we jointly utilize a quantization technique to decrease
the storage requirements of IL. We implement FastICARL on two types of mobile
devices and demonstrate that FastICARL reduces IL time by up to 78-92% and
storage requirements by 2-4 times without sacrificing performance. FastICARL
enables complete on-device IL, ensuring user privacy because the user data does
not need to leave the device.
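As a hedged illustration of the two ingredients this abstract mentions, exemplar selection via k-nearest-neighbor distances and quantization to shrink storage, the sketch below keeps a small per-class exemplar set of feature vectors and stores it in int8. It is a simplified stand-in only; the exact selection criterion and quantization scheme in FastICARL may differ.

```python
import numpy as np

def select_and_quantize_exemplars(features, budget, n_neighbors=5):
    """Keep `budget` exemplars for one class and store them in int8.

    Samples are ranked by their mean distance to their k nearest neighbors
    in feature space (the most central ones are kept here), then the kept
    feature vectors are quantized to 8-bit integers, roughly 4x smaller
    than float32 storage.
    """
    feats = np.asarray(features, dtype=np.float32)
    dists = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                      # ignore self-distance
    knn_dist = np.sort(dists, axis=1)[:, :n_neighbors].mean(axis=1)
    keep = np.argsort(knn_dist)[:budget]                 # most representative samples

    kept = feats[keep]
    scale = np.abs(kept).max() / 127.0 + 1e-12           # symmetric int8 scale
    q = np.clip(np.round(kept / scale), -127, 127).astype(np.int8)
    return q, scale, keep

# Hypothetical per-class audio embeddings, shape (n_clips, feature_dim).
feats = np.random.randn(200, 64).astype(np.float32)
q, scale, idx = select_and_quantize_exemplars(feats, budget=20)
dequantized = q.astype(np.float32) * scale               # used later for rehearsal
```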