812 research outputs found

    Continual learning from stationary and non-stationary data

    Get PDF
    Continual learning aims at developing models that are capable of working on constantly evolving problems over a long-time horizon. In such environments, we can distinguish three essential aspects of training and maintaining machine learning models - incorporating new knowledge, retaining it and reacting to changes. Each of them poses its own challenges, constituting a compound problem with multiple goals. Remembering previously incorporated concepts is the main property of a model that is required when dealing with stationary distributions. In non-stationary environments, models should be capable of selectively forgetting outdated decision boundaries and adapting to new concepts. Finally, a significant difficulty can be found in combining these two abilities within a single learning algorithm, since, in such scenarios, we have to balance remembering and forgetting instead of focusing only on one aspect. The presented dissertation addressed these problems in an exploratory way. Its main goal was to grasp the continual learning paradigm as a whole, analyze its different branches and tackle identified issues covering various aspects of learning from sequentially incoming data. By doing so, this work not only filled several gaps in the current continual learning research but also emphasized the complexity and diversity of challenges existing in this domain. Comprehensive experiments conducted for all of the presented contributions have demonstrated their effectiveness and substantiated the validity of the stated claims

    Towards Efficient Lifelong Machine Learning in Deep Neural Networks

    Get PDF
    Humans continually learn and adapt to new knowledge and environments throughout their lifetimes. Rarely does learning new information cause humans to catastrophically forget previous knowledge. While deep neural networks (DNNs) now rival human performance on several supervised machine perception tasks, when updated on changing data distributions, they catastrophically forget previous knowledge. Enabling DNNs to learn new information over time opens the door for new applications such as self-driving cars that adapt to seasonal changes or smartphones that adapt to changing user preferences. In this dissertation, we propose new methods and experimental paradigms for efficiently training continual DNNs without forgetting. We then apply these methods to several visual and multi-modal perception tasks including image classification, visual question answering, analogical reasoning, and attribute and relationship prediction in visual scenes

    A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts

    Full text link
    Machine learning methods strive to acquire a robust model during training that can generalize well to test samples, even under distribution shifts. However, these methods often suffer from a performance drop due to unknown test distributions. Test-time adaptation (TTA), an emerging paradigm, has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions. Recent progress in this paradigm highlights the significant benefits of utilizing unlabeled data for training self-adapted models prior to inference. In this survey, we divide TTA into several distinct categories, namely, test-time (source-free) domain adaptation, test-time batch adaptation, online test-time adaptation, and test-time prior adaptation. For each category, we provide a comprehensive taxonomy of advanced algorithms, followed by a discussion of different learning scenarios. Furthermore, we analyze relevant applications of TTA and discuss open challenges and promising areas for future research. A comprehensive list of TTA methods can be found at \url{https://github.com/tim-learn/awesome-test-time-adaptation}.Comment: Discussions, comments, and questions are all welcomed in \url{https://github.com/tim-learn/awesome-test-time-adaptation

    A survey on learning from imbalanced data streams: taxonomy, challenges, empirical study, and reproducible experimental framework

    Full text link
    Class imbalance poses new challenges when it comes to classifying data streams. Many algorithms recently proposed in the literature tackle this problem using a variety of data-level, algorithm-level, and ensemble approaches. However, there is a lack of standardized and agreed-upon procedures on how to evaluate these algorithms. This work presents a taxonomy of algorithms for imbalanced data streams and proposes a standardized, exhaustive, and informative experimental testbed to evaluate algorithms in a collection of diverse and challenging imbalanced data stream scenarios. The experimental study evaluates 24 state-of-the-art data streams algorithms on 515 imbalanced data streams that combine static and dynamic class imbalance ratios, instance-level difficulties, concept drift, real-world and semi-synthetic datasets in binary and multi-class scenarios. This leads to the largest experimental study conducted so far in the data stream mining domain. We discuss the advantages and disadvantages of state-of-the-art classifiers in each of these scenarios and we provide general recommendations to end-users for selecting the best algorithms for imbalanced data streams. Additionally, we formulate open challenges and future directions for this domain. Our experimental testbed is fully reproducible and easy to extend with new methods. This way we propose the first standardized approach to conducting experiments in imbalanced data streams that can be used by other researchers to create trustworthy and fair evaluation of newly proposed methods. Our experimental framework can be downloaded from https://github.com/canoalberto/imbalanced-streams

    Robust Deep Learning in the Open World with Lifelong Learning and Representation Learning

    Full text link
    Deep neural networks have shown a superior performance in many learning problems by learning hierarchical latent representations from a large amount of labeled data. However, the success of deep learning methods is under the closed-world assumption: no instances of new classes appear at test time. On the contrary, our world is open and dynamic, such that the closed-world assumption may not hold in many real applications. In other words, deep learning-based agents are not guaranteed to work in the open world, where instances of unknown and unseen classes are pervasive. In this dissertation, we explore lifelong learning and representation learning to generalize deep learning methods to the open world. Lifelong learning involves identifying novel classes and incrementally learning them without training from scratch, and representation learning involves being robust to data distribution shifts. Specifically, we propose 1) hierarchical novelty detection for detecting and identifying novel classes, 2) continual learning with unlabeled data to overcome catastrophic forgetting when learning the novel classes, 3) network randomization for learning robust representations across visual domain shifts, and 4) domain-agnostic contrastive representation learning, which is robust to data distribution shifts. The first part of this dissertation studies a cycle of lifelong learning. We divide it into two steps and present how we can achieve each step: first, we propose a new novelty detection and classification framework termed hierarchical novelty detection for detecting and identifying novel classes. Then, we show that unlabeled data easily obtainable in the open world are useful to avoid forgetting about the previously learned classes when learning novel classes. We propose a new knowledge distillation method and confidence-based sampling method to effectively leverage the unlabeled data. The second part of this dissertation studies robust representation learning: first, we present a network randomization method to learn an invariant representation across visual changes, particularly effective in deep reinforcement learning. Then, we propose a domain-agnostic robust representation learning method by introducing vicinal risk minimization in contrastive representation learning, which consistently improves the quality of representation and transferability across data distribution shifts.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/162981/1/kibok_1.pd
    • …
    corecore