
    A Survey on Continual Semantic Segmentation: Theory, Challenge, Method and Application

    Continual learning, also known as incremental learning or life-long learning, stands at the forefront of deep learning and AI systems. It breaks through the obstacle of one-way training on closed sets and enables continuous, adaptive learning under open-set conditions. Over the past decade, continual learning has been explored and applied in multiple fields, especially in computer vision, covering classification, detection and segmentation tasks. Continual semantic segmentation (CSS) is a challenging, intricate and burgeoning task owing to its dense-prediction nature. In this paper, we present a review of CSS, building a comprehensive survey of problem formulations, primary challenges, universal datasets, recent theories and diverse applications. Concretely, we begin by elucidating the problem definitions and primary challenges. Based on an in-depth investigation of relevant approaches, we categorize current CSS models into two main branches: data-replay and data-free methods. In each branch, the corresponding approaches are clustered by similarity and thoroughly analyzed, followed by qualitative comparison and quantitative reproduction on relevant datasets. We also introduce four CSS specialities with diverse application scenarios and development trends. Furthermore, we develop a benchmark for CSS encompassing representative references, evaluation results and reproductions, which is available at https://github.com/YBIO/SurveyCSS. We hope this survey can serve as a reference-worthy and stimulating contribution to the advancement of the life-long learning field, while also providing valuable perspectives for related fields.
    Comment: 20 pages, 12 figures. Under review.
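    The data-replay versus data-free taxonomy above can be made concrete with a small, purely illustrative sketch of the replay idea: keep a bounded exemplar buffer of old-task images and masks and mix a few of them into every new-task batch. The class and function names below are assumptions for illustration, not code from any surveyed method.

```python
# Minimal sketch of the data-replay idea for continual semantic segmentation:
# a bounded exemplar buffer of earlier-task samples is mixed into each
# new-task batch to mitigate forgetting. Illustrative only.
import random

class ReplayBuffer:
    def __init__(self, capacity=200):
        self.capacity = capacity
        self.exemplars = []  # (image, mask) pairs from earlier tasks

    def add(self, image, mask):
        # Reservoir-style insertion keeps the buffer bounded.
        if len(self.exemplars) < self.capacity:
            self.exemplars.append((image, mask))
        else:
            self.exemplars[random.randrange(self.capacity)] = (image, mask)

    def sample(self, k):
        return random.sample(self.exemplars, min(k, len(self.exemplars)))

def mixed_batch(new_task_batch, buffer, replay_ratio=0.25):
    """Append replayed old-class exemplars to the current-task batch."""
    k = int(len(new_task_batch) * replay_ratio)
    return list(new_task_batch) + buffer.sample(k)
```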

    CAFE: Learning to Condense Dataset by Aligning Features

    Dataset condensation aims to reduce network training effort by condensing a cumbersome training set into a compact synthetic one. State-of-the-art approaches largely rely on learning the synthetic data by matching the gradients between real and synthetic data batches. Despite the intuitive motivation and promising results, such gradient-based methods, by nature, easily overfit to a biased set of samples that produce dominant gradients, and thus lack global supervision of the data distribution. In this paper, we propose a novel scheme to Condense dataset by Aligning FEatures (CAFE), which explicitly attempts to preserve the real-feature distribution as well as the discriminative power of the resulting synthetic set, lending itself to strong generalization capability across various architectures. At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales, while accounting for the classification of real samples. Our scheme is further backed up by a novel dynamic bi-level optimization, which adaptively adjusts parameter updates to prevent over- and under-fitting. We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art: on the SVHN dataset, for example, the performance gain is up to 11%. Extensive experiments and analyses verify the effectiveness and necessity of the proposed designs.
    Comment: Accepted by CVPR 2022.
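    As a rough illustration of the feature-alignment idea (not the authors' implementation), the sketch below computes a multi-scale alignment loss between the batch-mean features of real and synthetic data; the list-of-activations interface is an assumption about how intermediate features would be collected.

```python
# Hedged sketch of layer-wise feature alignment in the spirit of CAFE:
# match the mean activations of real and synthetic batches at several depths.
import torch
import torch.nn.functional as F

def feature_alignment_loss(real_feats, syn_feats):
    """real_feats / syn_feats: lists of per-layer activations, each of shape
    (batch, channels, h, w). Returns the summed MSE between batch-mean
    features across scales."""
    loss = torch.zeros(())
    for fr, fs in zip(real_feats, syn_feats):
        loss = loss + F.mse_loss(fr.mean(dim=0), fs.mean(dim=0))
    return loss
```

    In a full pipeline this term would be combined with a classification loss on the real samples and optimized with respect to the synthetic images, in line with the bi-level scheme the abstract describes.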

    On-line learning with minimal degradation in feedforward networks

    Dealing with non-stationary processes requires quick adaptation while at the same time avoiding catastrophic forgetting. A neural learning technique that satisfies these requirements, without sacrificing the benefits of distributed representations, is presented. It relies on a formalization of the problem as the minimization of the error over the previously learned input-output (i-o) patterns, subject to the constraint of perfect encoding of the new pattern. This constrained optimization problem is then transformed into an unconstrained one with hidden-unit activations as the variables. The new formulation naturally leads to an algorithm for solving the problem, which we call Learning with Minimal Degradation (LMD). Experimental comparisons of the performance of LMD with back-propagation are provided which, besides showing the advantages of using LMD, reveal the dependence of forgetting on the learning rate in back-propagation. We also explain why overtraining affects forgetting and fault tolerance, which are seen as related problems.
    Peer reviewed.
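    To make the constrained formulation concrete, here is a minimal numpy sketch restricted to a single linear output layer: it finds the weight update that exactly encodes one new pattern while minimizing the induced output change on previously learned activations. This illustrates the "minimize degradation subject to perfect encoding" idea only; it is not the LMD algorithm itself, which instead optimizes over hidden-unit activations.

```python
# Constrained least-squares illustration: smallest-degradation update of a
# linear layer that exactly encodes one new input-output pair.
import numpy as np

def minimal_degradation_update(W, H_old, h_new, y_new, ridge=1e-6):
    """W: (m, d) current weights; H_old: (n, d) stored old activations;
    h_new: (d,) new activation; y_new: (m,) new target.
    Returns dW such that (W + dW) @ h_new == y_new while ||dW @ H_old.T||_F,
    the change of the outputs on the old patterns, is minimized."""
    A = H_old.T @ H_old + ridge * np.eye(H_old.shape[1])  # (d, d) Gram matrix
    v = np.linalg.solve(A, h_new)                          # A^{-1} h_new
    r = y_new - W @ h_new                                  # residual to encode
    return np.outer(r, v) / (h_new @ v)                    # Lagrangian solution
```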

    THE PSYCHOMETRIC PROPERTIES OF A SOCIAL-EMOTIONAL LEARNING MEASURE

    Each year many students take college admissions exams (i.e., SAT® and ACT®), hoping to demonstrate their ability to perform at a collegiate level and gain admission to desired universities. However, a growing movement encourages colleges and universities to abandon this practice in their admissions protocols and instead consider alternative factors, such as social-emotional learning skills, to identify promising applicants. As such, this study examined the psychometric properties of a novel social-emotional learning measure, ACT® Tessera®, which conceptualizes social-emotional traits through the lens of the Five-Factor Model using different measurement methods (self-report Likert scales, situational judgement tests, and forced choice). Using data obtained from an undergraduate student sample at a metropolitan university, reliability and validity analyses revealed promising evidence for the scale's ability to measure social-emotional skills. However, recommendations for future scale iterations are made to improve the scale's psychometric properties. ACT® Tessera® social-emotional trait measures were then assessed alongside traditional college achievement predictors (intelligence, cognitive ability, standardized test scores) to determine their ability to predict undergraduate success. Preliminary evidence provided by this study suggests that considering social-emotional traits in conjunction with high school GPA may provide useful predictions of university success without standardized test scores. Suggestions for future research and implications for school psychologists are discussed.

    TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models

    Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety. However, the continual learning aspect of these aligned LLMs has been largely overlooked. Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs, owing to both their simplicity and the models' potential exposure during instruction tuning. In this paper, we introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs. TRACE consists of 8 distinct datasets spanning challenging tasks including domain-specific tasks, multilingual capabilities, code generation, and mathematical reasoning. All datasets are standardized into a unified format, allowing for effortless automatic evaluation of LLMs. Our experiments show that after training on TRACE, aligned LLMs exhibit significant declines in both general ability and instruction-following capability. For example, the accuracy of llama2-chat 13B on the gsm8k dataset dropped precipitously from 28.8% to 2% after training on our datasets. This highlights the challenge of finding a suitable trade-off between achieving performance on specific tasks and preserving the original prowess of LLMs. Empirical findings suggest that tasks inherently equipped with reasoning paths contribute significantly to preserving certain capabilities of LLMs against potential declines. Motivated by this, we introduce the Reasoning-augmented Continual Learning (RCL) approach, which integrates task-specific cues with meta-rationales, effectively reducing catastrophic forgetting in LLMs while expediting convergence on novel tasks.
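    As a hedged sketch of the bookkeeping such an evaluation implies, the snippet below measures held-out general-ability suites before and after sequential training and reports the per-suite drop. The evaluate() hook and suite names are placeholders, not TRACE's actual harness.

```python
# Illustrative forgetting bookkeeping: compare general-ability accuracy
# before and after continual training on the benchmark's task sequence.
def general_ability_drop(model_before, model_after, suites, evaluate):
    """suites: iterable of benchmark names; evaluate(model, suite) -> accuracy.
    Returns {suite: accuracy_before - accuracy_after} for each held-out suite."""
    return {s: evaluate(model_before, s) - evaluate(model_after, s)
            for s in suites}

# Hypothetical usage:
# drops = general_ability_drop(base_llm, tuned_llm, ["gsm8k", "mmlu"], evaluate)
```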