2,247 research outputs found

    A Comprehensive Empirical Evaluation on Online Continual Learning

    Full text link
    Online continual learning aims to get closer to a live learning experience by learning directly on a stream of data with temporally shifting distribution and by storing a minimum amount of data from that stream. In this empirical evaluation, we evaluate various methods from the literature that tackle online continual learning. More specifically, we focus on the class-incremental setting in the context of image classification, where the learner must learn new classes incrementally from a stream of data. We compare these methods on the Split-CIFAR100 and Split-TinyImagenet benchmarks, and measure their average accuracy, forgetting, stability, and quality of the representations, to evaluate various aspects of the algorithm at the end but also during the whole training period. We find that most methods suffer from stability and underfitting issues. However, the learned representations are comparable to i.i.d. training under the same computational budget. No clear winner emerges from the results and basic experience replay, when properly tuned and implemented, is a very strong baseline. We release our modular and extensible codebase at https://github.com/AlbinSou/ocl_survey based on the avalanche framework to reproduce our results and encourage future research.Comment: ICCV Visual Continual Learning Workshop 2023 accepted pape

    Open Set Classification for Deep Learning in Large-Scale and Continual Learning Models

    Get PDF
    Supervised classification methods often assume the train and test data distributions are the same and that all classes in the test set are present in the training set. However, deployed classifiers require the ability to recognize inputs from outside the training set as unknowns and update representations in near real-time to account for novel concepts unknown during offline training. This problem has been studied under multiple paradigms including out-of-distribution detection and open set recognition; however, for convolutional neural networks, there have been two major approaches: 1) inference methods to separate known inputs from unknown inputs and 2) feature space regularization strategies to improve model robustness to novel inputs. In this dissertation, we explore the relationship between the two approaches and directly compare performance on large-scale datasets that have more than a few dozen categories. Using the ImageNet large-scale classification dataset, we identify novel combinations of regularization and specialized inference methods that perform best across multiple open set classification problems of increasing difficulty level. We find that input perturbation and temperature scaling yield significantly better performance on large-scale datasets than other inference methods tested, regardless of the feature space regularization strategy. Conversely, we also find that improving performance with advanced regularization schemes during training yields better performance when baseline inference techniques are used; however, this often requires supplementing the training data with additional background samples which is difficult in large-scale problems. To overcome this problem we further propose a simple regularization technique that can be easily applied to existing convolutional neural network architectures that improves open set robustness without the requirement for a background dataset. Our novel method achieves state-of-the-art results on open set classification baselines and easily scales to large-scale problems. Finally, we explore the intersection of open set and continual learning to establish baselines for the first time for novelty detection while learning from online data streams. To accomplish this we establish a novel dataset created for evaluating image open set classification capabilities of streaming learning algorithms. Finally, using our new baselines we draw conclusions as to what the most computationally efficient means of detecting novelty in pre-trained models and what properties of an efficient open set learning algorithm operating in the streaming paradigm should possess

    Graceful Degradation and Related Fields

    Full text link
    When machine learning models encounter data which is out of the distribution on which they were trained they have a tendency to behave poorly, most prominently over-confidence in erroneous predictions. Such behaviours will have disastrous effects on real-world machine learning systems. In this field graceful degradation refers to the optimisation of model performance as it encounters this out-of-distribution data. This work presents a definition and discussion of graceful degradation and where it can be applied in deployed visual systems. Following this a survey of relevant areas is undertaken, novelly splitting the graceful degradation problem into active and passive approaches. In passive approaches, graceful degradation is handled and achieved by the model in a self-contained manner, in active approaches the model is updated upon encountering epistemic uncertainties. This work communicates the importance of the problem and aims to prompt the development of machine learning strategies that are aware of graceful degradation

    Omnidirectional Transfer for Quasilinear Lifelong Learning

    Full text link
    In biological learning, data are used to improve performance not only on the current task, but also on previously encountered and as yet unencountered tasks. In contrast, classical machine learning starts from a blank slate, or tabula rasa, using data only for the single task at hand. While typical transfer learning algorithms can improve performance on future tasks, their performance on prior tasks degrades upon learning new tasks (called catastrophic forgetting). Many recent approaches for continual or lifelong learning have attempted to maintain performance given new tasks. But striving to avoid forgetting sets the goal unnecessarily low: the goal of lifelong learning, whether biological or artificial, should be to improve performance on all tasks (including past and future) with any new data. We propose omnidirectional transfer learning algorithms, which includes two special cases of interest: decision forests and deep networks. Our key insight is the development of the omni-voter layer, which ensembles representations learned independently on all tasks to jointly decide how to proceed on any given new data point, thereby improving performance on both past and future tasks. Our algorithms demonstrate omnidirectional transfer in a variety of simulated and real data scenarios, including tabular data, image data, spoken data, and adversarial tasks. Moreover, they do so with quasilinear space and time complexity

    On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey

    Full text link
    Recent advances in NLP are brought by a range of large-scale pretrained language models (PLMs). These PLMs have brought significant performance gains for a range of NLP tasks, circumventing the need to customize complex designs for specific tasks. However, most current work focus on finetuning PLMs on a domain-specific datasets, ignoring the fact that the domain gap can lead to overfitting and even performance drop. Therefore, it is practically important to find an appropriate method to effectively adapt PLMs to a target domain of interest. Recently, a range of methods have been proposed to achieve this purpose. Early surveys on domain adaptation are not suitable for PLMs due to the sophisticated behavior exhibited by PLMs from traditional models trained from scratch and that domain adaptation of PLMs need to be redesigned to take effect. This paper aims to provide a survey on these newly proposed methods and shed light in how to apply traditional machine learning methods to newly evolved and future technologies. By examining the issues of deploying PLMs for downstream tasks, we propose a taxonomy of domain adaptation approaches from a machine learning system view, covering methods for input augmentation, model optimization and personalization. We discuss and compare those methods and suggest promising future research directions

    DyAnNet: A Scene Dynamicity Guided Self-Trained Video Anomaly Detection Network

    Full text link
    Unsupervised approaches for video anomaly detection may not perform as good as supervised approaches. However, learning unknown types of anomalies using an unsupervised approach is more practical than a supervised approach as annotation is an extra burden. In this paper, we use isolation tree-based unsupervised clustering to partition the deep feature space of the video segments. The RGB- stream generates a pseudo anomaly score and the flow stream generates a pseudo dynamicity score of a video segment. These scores are then fused using a majority voting scheme to generate preliminary bags of positive and negative segments. However, these bags may not be accurate as the scores are generated only using the current segment which does not represent the global behavior of a typical anomalous event. We then use a refinement strategy based on a cross-branch feed-forward network designed using a popular I3D network to refine both scores. The bags are then refined through a segment re-mapping strategy. The intuition of adding the dynamicity score of a segment with the anomaly score is to enhance the quality of the evidence. The method has been evaluated on three popular video anomaly datasets, i.e., UCF-Crime, CCTV-Fights, and UBI-Fights. Experimental results reveal that the proposed framework achieves competitive accuracy as compared to the state-of-the-art video anomaly detection methods.Comment: 10 pages, 8 figures, and 4 tables. (ACCEPTED AT WACV 2023
    • …
    corecore