A Comprehensive Survey on Knowledge Distillation of Diffusion Models

Author: Luo Weijian
Publication date: 09/04/2023

Diffusion Models (DMs), also referred to as score-based diffusion models, utilize neural networks to specify score functions. Unlike most other probabilistic models, DMs directly model the score functions, which makes them more flexible to parametrize and potentially highly expressive for probabilistic modeling. DMs can learn fine-grained knowledge, i.e., marginal score functions, of the underlying distribution. Therefore, a crucial research direction is to explore how to distill the knowledge of DMs and fully utilize their potential. Our objective is to provide a comprehensible overview of the modern approaches for distilling DMs, starting with an introduction to DMs and a discussion of the challenges involved in distilling them into neural vector fields. We also provide an overview of the existing works on distilling DMs into both stochastic and deterministic implicit generators. Finally, we review the accelerated diffusion sampling algorithms as a training-free method for distillation. Our tutorial is intended for individuals with a basic understanding of generative models who wish to apply DM distillation or embark on a research project in this field.

arXiv.org e-Print Archive

Knowledge Distillation and Continual Learning for Optimized Deep Neural Networks

Author: Phan Vu Minh Hieu
Publication venue: School of Electrical, Computer and Telecommunications Engineering
Publication date: 01/01/2023

Over the past few years, deep learning (DL) has been achieving state-of-the-art performance on various human tasks such as speech generation, language translation, image segmentation, and object detection. While traditional machine learning models require hand-crafted features, deep learning algorithms can automatically extract discriminative features and learn complex knowledge from large datasets. This powerful learning ability makes deep learning models attractive to both academia and big corporations. Despite their popularity, deep learning methods still have two main limitations: large memory consumption and catastrophic knowledge forgetting. First, DL algorithms use very deep neural networks (DNNs) with many billions of parameters, which have a large model size and a slow inference speed. This restricts the application of DNNs in resource-constrained devices such as mobile phones and autonomous vehicles. Second, DNNs are known to suffer from catastrophic forgetting. When incrementally learning new tasks, the model performance on old tasks drops significantly. The ability to accommodate new knowledge while retaining previously learned knowledge is called continual learning. Since the real-world environments in which the model operates are always evolving, a robust neural network needs to have this continual learning ability to adapt to new changes.

Research Online

Integrating State-of-the-Art Approaches for Anomaly Detection and Localization in the Continual Learning Setting

Author: BUGARIC JOVANA
Publication date: 24/10/2023

The significant attention surrounding the application of anomaly detection (AD) in identifying defects within industrial environments using only normal samples has prompted research and development in this area. However, traditional AD methods have been primarily focused on the current set of examples, resulting in a limitation known as catastrophic forgetting when encountering new tasks. The inflexibility of these methods and the challenges posed by real-world industrial scenarios necessitate the urgent enhancement of the adaptive capabilities of AD models. Therefore, this thesis presents an integrated framework that combines the concepts of continual learning (CL) and anomaly detection (AD) to achieve the objective of anomaly detection in continual learning (ADCL). To evaluate the efficacy of the framework, a thorough comparative analysis is conducted to assess the performance of three specific methods for the AD task: EfficientAD, the Patch Distribution Modeling Framework (PaDiM), and the Discriminatively Trained Reconstruction Anomaly Embedding Model (DRAEM). Moreover, the framework incorporates the use of replay techniques to enable continual learning (CL). In order to determine the superior technique, a comprehensive evaluation is carried out using diverse metrics that measure the relative performance of each method. To validate the proposed approach, a robust real-world dataset called MVTec AD is employed, consisting of images with pixel-based anomalies. This dataset serves as a reliable benchmark for Anomaly Detection in the context of Continual Learning, offering a solid foundation for further advancements in this field of study.

Padua Thesis and Dissertation Archive

Comparative Evaluation and Implementation of State-of-the-Art Techniques for Anomaly Detection and Localization in the Continual Learning Framework

Author: BUGARIN NIKOLA
Publication date: 24/10/2023

The capability of anomaly detection (AD) to detect defects in industrial environments using only normal samples has attracted significant attention. However, traditional AD methods have primarily concentrated on the current set of examples, leading to a significant drawback of catastrophic forgetting when faced with new tasks. Due to the constraints in flexibility and the challenges posed by real-world industrial scenarios, there is an urgent need to strengthen the adaptive capabilities of AD models. Hence, this thesis introduces a unified framework that integrates continual learning (CL) and anomaly detection (AD) to accomplish the goal of anomaly detection in continual learning (ADCL). To evaluate the effectiveness of the framework, a comparative analysis is performed to assess the performance of three specific feature-based methods for the AD task: Coupled-Hypersphere-Based Feature Adaptation (CFA), the Student-Teacher approach, and PatchCore. Furthermore, the framework incorporates the utilization of replay techniques to facilitate continual learning (CL). A comprehensive evaluation is conducted using a range of metrics to analyze the relative performance of each technique and identify the one that exhibits superior results. To validate the effectiveness of the proposed approach, the MVTec AD dataset, consisting of real-world images with pixel-based anomalies, is utilized. This dataset serves as a reliable benchmark for Anomaly Detection in the context of Continual Learning, providing a solid foundation for further advancements in the field.

Padua Thesis and Dissertation Archive