Continual learning from stationary and non-stationary data
Continual learning aims to develop models that can work on constantly evolving problems over a long time horizon. In such environments, we can distinguish three essential aspects of training and maintaining machine learning models: incorporating new knowledge, retaining it, and reacting to changes. Each of them poses its own challenges, constituting a compound problem with multiple goals.
Remembering previously incorporated concepts is the main property of a model that is required when dealing with stationary distributions. In non-stationary environments, models should be capable of selectively forgetting outdated decision boundaries and adapting to new concepts. Finally, a significant difficulty can be found in combining these two abilities within a single learning algorithm, since, in such scenarios, we have to balance remembering and forgetting instead of focusing only on one aspect.
The presented dissertation addressed these problems in an exploratory way. Its main goal was to grasp the continual learning paradigm as a whole, analyze its different branches, and tackle identified issues covering various aspects of learning from sequentially incoming data. By doing so, this work not only filled several gaps in current continual learning research but also emphasized the complexity and diversity of the challenges existing in this domain. Comprehensive experiments conducted for all of the presented contributions have demonstrated their effectiveness and substantiated the validity of the stated claims.
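The balance between remembering and forgetting described in this abstract can be illustrated with a toy estimator. The sketch below is not taken from the dissertation; the class name `FadingMean` and the `fading` parameter are hypothetical, chosen only for illustration. With `fading=1.0` every observation is remembered equally, which suits a stationary stream; with `fading<1.0` older observations are gradually forgotten, which lets the estimate follow a change in the data.

```python
class FadingMean:
    """Running mean with an exponential forgetting factor (illustrative only)."""

    def __init__(self, fading=1.0):
        # fading = 1.0 keeps all history; fading < 1.0 down-weights old data.
        self.fading = fading
        self.weighted_sum = 0.0
        self.weight = 0.0

    def update(self, x):
        # Decay the accumulated statistics, then add the new observation.
        self.weighted_sum = self.fading * self.weighted_sum + x
        self.weight = self.fading * self.weight + 1.0

    @property
    def value(self):
        return self.weighted_sum / self.weight if self.weight else 0.0


# A stream whose mean jumps from 0 to 10 halfway through (an abrupt change).
stream = [0.0] * 100 + [10.0] * 100

remembering = FadingMean(fading=1.0)
forgetting = FadingMean(fading=0.9)
for x in stream:
    remembering.update(x)
    forgetting.update(x)

print(remembering.value)  # average over the whole history
print(forgetting.value)   # close to the most recent concept
```

The estimator that never forgets ends up halfway between the old and new concepts, while the fading one tracks the new concept at the cost of discarding the old one. A learner meant for both settings has to choose, or adapt, this trade-off.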
Towards Efficient Lifelong Machine Learning in Deep Neural Networks
Humans continually learn and adapt to new knowledge and environments throughout their lifetimes. Rarely does learning new information cause humans to catastrophically forget previous knowledge. While deep neural networks (DNNs) now rival human performance on several supervised machine perception tasks, when updated on changing data distributions, they catastrophically forget previous knowledge. Enabling DNNs to learn new information over time opens the door for new applications such as self-driving cars that adapt to seasonal changes or smartphones that adapt to changing user preferences. In this dissertation, we propose new methods and experimental paradigms for efficiently training continual DNNs without forgetting. We then apply these methods to several visual and multi-modal perception tasks including image classification, visual question answering, analogical reasoning, and attribute and relationship prediction in visual scenes.
A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts
Machine learning methods strive to acquire a robust model during training
that can generalize well to test samples, even under distribution shifts.
However, these methods often suffer from a performance drop due to unknown test
distributions. Test-time adaptation (TTA), an emerging paradigm, has the
potential to adapt a pre-trained model to unlabeled data during testing, before
making predictions. Recent progress in this paradigm highlights the significant
benefits of utilizing unlabeled data for training self-adapted models prior to
inference. In this survey, we divide TTA into several distinct categories,
namely, test-time (source-free) domain adaptation, test-time batch adaptation,
online test-time adaptation, and test-time prior adaptation. For each category,
we provide a comprehensive taxonomy of advanced algorithms, followed by a
discussion of different learning scenarios. Furthermore, we analyze relevant
applications of TTA and discuss open challenges and promising areas for future
research. A comprehensive list of TTA methods can be found at
https://github.com/tim-learn/awesome-test-time-adaptation.
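As a rough illustration of adapting to an unlabeled test batch before predicting, the sketch below re-estimates feature normalization statistics from the test batch itself and leaves the classifier weights untouched. This is only one simple strategy and is not presented as any specific method from the survey; the function names, the one-dimensional linear classifier, and the toy data are all assumptions made for the example.

```python
from statistics import fmean, pstdev


def column_stats(batch):
    """Per-feature mean and population standard deviation of a batch."""
    columns = list(zip(*batch))
    return [fmean(c) for c in columns], [pstdev(c) for c in columns]


def normalize(batch, mean, std, eps=1e-8):
    """Standardize each feature with the given statistics."""
    return [[(x - m) / (s + eps) for x, m, s in zip(row, mean, std)]
            for row in batch]


def predict(batch, weights, bias):
    """Fixed (pre-trained) linear classifier: class 1 if the score is positive."""
    return [int(sum(w * x for w, x in zip(weights, row)) + bias > 0)
            for row in batch]


def predict_with_batch_adaptation(test_batch, weights, bias):
    """Normalize with statistics of the unlabeled test batch, then predict."""
    mean, std = column_stats(test_batch)
    return predict(normalize(test_batch, mean, std), weights, bias)


# Hypothetical source-domain statistics and classifier parameters.
source_mean, source_std = [0.0], [1.0]
weights, bias = [1.0], 0.0

# Test batch whose single feature is shifted by about +5 relative to the source.
test_batch = [[4.0], [4.5], [5.5], [6.0]]

unadapted = predict(normalize(test_batch, source_mean, source_std), weights, bias)
adapted = predict_with_batch_adaptation(test_batch, weights, bias)
print(unadapted, adapted)
```

With the stale source statistics, every shifted sample falls on the same side of the decision boundary. Using the test batch's own statistics recentres the inputs without any labels. The sketch assumes the shift is a simple offset in the inputs and that the test batch has roughly the same class balance as the source data; when either assumption fails, this kind of renormalization can hurt rather than help.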
A survey on learning from imbalanced data streams: taxonomy, challenges, empirical study, and reproducible experimental framework
Class imbalance poses new challenges when it comes to classifying data
streams. Many algorithms recently proposed in the literature tackle this
problem using a variety of data-level, algorithm-level, and ensemble
approaches. However, there is a lack of standardized and agreed-upon procedures
on how to evaluate these algorithms. This work presents a taxonomy of
algorithms for imbalanced data streams and proposes a standardized, exhaustive,
and informative experimental testbed to evaluate algorithms in a collection of
diverse and challenging imbalanced data stream scenarios. The experimental
study evaluates 24 state-of-the-art data stream algorithms on 515 imbalanced
data streams that combine static and dynamic class imbalance ratios,
instance-level difficulties, concept drift, real-world and semi-synthetic
datasets in binary and multi-class scenarios. This leads to the largest
experimental study conducted so far in the data stream mining domain. We
discuss the advantages and disadvantages of state-of-the-art classifiers in
each of these scenarios and we provide general recommendations to end-users for
selecting the best algorithms for imbalanced data streams. Additionally, we
formulate open challenges and future directions for this domain. Our
experimental testbed is fully reproducible and easy to extend with new methods.
This way we propose the first standardized approach to conducting experiments
in imbalanced data streams that can be used by other researchers to create
trustworthy and fair evaluations of newly proposed methods. Our experimental
framework can be downloaded from
https://github.com/canoalberto/imbalanced-streams
Robust Deep Learning in the Open World with Lifelong Learning and Representation Learning
Deep neural networks have shown superior performance in many learning problems by learning hierarchical latent representations from a large amount of labeled data. However, the success of deep learning methods rests on the closed-world assumption: no instances of new classes appear at test time. On the contrary, our world is open and dynamic, such that the closed-world assumption may not hold in many real applications. In other words, deep learning-based agents are not guaranteed to work in the open world, where instances of unknown and unseen classes are pervasive.
In this dissertation, we explore lifelong learning and representation learning to generalize deep learning methods to the open world. Lifelong learning involves identifying novel classes and incrementally learning them without training from scratch, and representation learning involves being robust to data distribution shifts. Specifically, we propose 1) hierarchical novelty detection for detecting and identifying novel classes, 2) continual learning with unlabeled data to overcome catastrophic forgetting when learning the novel classes, 3) network randomization for learning robust representations across visual domain shifts, and 4) domain-agnostic contrastive representation learning, which is robust to data distribution shifts.
The first part of this dissertation studies a cycle of lifelong learning. We divide it into two steps and present how we can achieve each step: first, we propose a new novelty detection and classification framework termed hierarchical novelty detection for detecting and identifying novel classes. Then, we show that unlabeled data easily obtainable in the open world are useful to avoid forgetting about the previously learned classes when learning novel classes. We propose a new knowledge distillation method and confidence-based sampling method to effectively leverage the unlabeled data.
The second part of this dissertation studies robust representation learning: first, we present a network randomization method to learn an invariant representation across visual changes, particularly effective in deep reinforcement learning. Then, we propose a domain-agnostic robust representation learning method by introducing vicinal risk minimization in contrastive representation learning, which consistently improves the quality of representation and transferability across data distribution shifts.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/162981/1/kibok_1.pd