Structural Teacher-Student Normality Learning for Multi-Class Anomaly Detection and Localization
Visual anomaly detection is a challenging open-set task aimed at identifying
unknown anomalous patterns while modeling normal data. The knowledge
distillation paradigm has shown remarkable performance in one-class anomaly
detection by leveraging teacher-student network feature comparisons. However,
extending this paradigm to multi-class anomaly detection introduces novel
scalability challenges. In this study, we address the significant performance
degradation observed in previous teacher-student models when applied to
multi-class anomaly detection, which we identify as resulting from cross-class
interference. To tackle this issue, we introduce a novel approach known as
Structural Teacher-Student Normality Learning (SNL): (1) We propose
spatial-channel distillation and intra- and inter-affinity distillation techniques
to measure structural distance between the teacher and student networks. (2) We
introduce a central residual aggregation module (CRAM) to encapsulate the
normal representation space of the student network. We evaluate our proposed
approach on two anomaly detection datasets, MVTecAD and VisA. Our method
surpasses the state-of-the-art distillation-based algorithms by a significant
margin of 3.9% and 1.5% on MVTecAD and 1.2% and 2.5% on VisA in the multi-class
anomaly detection and localization tasks, respectively. Furthermore, our
algorithm outperforms the current state-of-the-art unified models on both
MVTecAD and VisA.
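The teacher-student feature comparison that the abstract builds on can be illustrated with a minimal sketch. It is not the SNL method itself: it assumes that both networks emit an H x W grid of feature vectors and that the per-location anomaly score is the cosine distance between them, a common choice in distillation-based detectors. The function names `cosine_distance`, `anomaly_map`, and `image_score` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def anomaly_map(teacher_feats, student_feats):
    """Compare teacher and student features location by location.

    Both inputs are H x W grids (lists of rows) of feature vectors.
    A large distance means the student failed to imitate the teacher,
    which is taken here (as an assumption) to signal an anomaly.
    """
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher_feats, student_feats)
    ]


def image_score(amap):
    """Image-level score: the largest per-location distance."""
    return max(max(row) for row in amap)
```

Localization would use the map directly, while detection would threshold `image_score`; taking the maximum is one simple aggregation among several possible ones.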
Visual Anomaly Detection via Dual-Attention Transformer and Discriminative Flow
In this paper, we introduce the novel state-of-the-art Dual-attention
Transformer and Discriminative Flow (DADF) framework for visual anomaly
detection. Based on only normal knowledge, visual anomaly detection has wide
applications in industrial scenarios and has attracted significant attention.
However, most existing methods fail to meet the requirements. In contrast, the
proposed DADF presents a new paradigm: it first leverages a pre-trained
network to acquire multi-scale prior embeddings, followed by the development of
a vision Transformer with dual attention mechanisms, namely self-attention and
memorial-attention, to achieve two-level reconstruction for prior embeddings
with the sequential and normality association. Additionally, we propose using
normalizing flow to establish discriminative likelihood for the joint
distribution of prior and reconstructions at each scale. The DADF achieves
98.3/98.4 of image/pixel AUROC on MVTec AD; 83.7 of image AUROC and 67.4 of
pixel sPRO on MVTec LOCO AD benchmarks, demonstrating the effectiveness of our
proposed approach.
Comment: Submission to IEEE Transactions on Industrial Informatics
OpenPatch: a 3D patchwork for Out-Of-Distribution detection
Moving deep learning models from the laboratory setting to the open world
entails preparing them to handle unforeseen conditions. In several applications
the occurrence of novel classes during deployment poses a significant threat,
thus it is crucial to effectively detect them. Ideally, this skill should be
used when needed without requiring any further computational training effort at
every new task. Out-of-distribution detection has attracted significant
attention in recent years; however, the majority of studies deal with 2D
images, ignoring the inherent 3D nature of the real world and often confusing
domain novelty with semantic novelty. In this work, we focus on the latter,
considering the objects' geometric structure captured by 3D point clouds
regardless of the specific domain. We advance the field by introducing
OpenPatch that builds on a large pre-trained model and simply extracts from its
intermediate features a set of patch representations that describe each known
class. For any new sample, we obtain a novelty score by evaluating whether it
can be recomposed mainly by patches of a single known class or rather via the
contribution of multiple classes. We present an extensive experimental
evaluation of our approach for the task of semantic novelty detection on
real-world point cloud samples when the reference known data are synthetic. We
demonstrate that OpenPatch excels in both the full and few-shot known sample
scenarios, showcasing its robustness across varying pre-training objectives and
network backbones. The inherent training-free nature of our method allows for
its immediate application to a wide array of real-world tasks, offering a
compelling advantage over approaches that need expensive retraining efforts.
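The recomposition idea described above (a sample is likely known if its patches come mostly from a single known class) can be sketched as follows. This is an illustration under stated assumptions, not the OpenPatch scoring function: it assumes a memory bank of `(patch_vector, class_label)` pairs, assigns each test patch the class of its nearest bank entry by squared Euclidean distance, and uses the entropy of the resulting class distribution as the novelty score. The names `nearest_class` and `novelty_score` are hypothetical.

```python
import math
from collections import Counter


def nearest_class(patch, bank):
    """Return the class label of the bank patch closest to `patch`."""
    best_label, best_dist = None, float("inf")
    for vec, label in bank:
        dist = sum((a - b) ** 2 for a, b in zip(patch, vec))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def novelty_score(patches, bank):
    """Entropy of the classes assigned to a sample's patches.

    Zero when every patch maps to one known class; larger when the
    sample can only be recomposed from patches of several classes.
    """
    counts = Counter(nearest_class(p, bank) for p in patches)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```

Because the bank is built once from features of a pre-trained model, a score of this kind needs no further training, which matches the training-free property the abstract emphasizes.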
On Deep Machine Learning Methods for Anomaly Detection within Computer Vision
This thesis concerns deep learning approaches for anomaly detection in images. Anomaly detection addresses how to find any kind of pattern that differs from the regularities found in normal data and is receiving increasing attention in deep learning research. This is due in part to its wide set of potential applications, ranging from automated CCTV surveillance to quality control across a range of industries. We introduce three original methods for anomaly detection applicable to two specific deployment scenarios. In the first, we detect anomalous activity in potentially crowded scenes through imagery captured via CCTV or other video recording devices. In the second, we segment defects in textures and demonstrate use cases representative of automated quality inspection on industrial production lines. In the context of detecting anomalous activity in scenes, we take an existing state-of-the-art method and introduce several enhancements, including the use of a region proposal network for region extraction and a more information-preserving feature preprocessing strategy. This results in a simpler method that is significantly faster and suitable for real-time application. In addition, the increased efficiency facilitates building higher-dimensional models capable of improved anomaly detection performance, which we demonstrate on the pedestrian-based UCSD Ped2 dataset. In the context of texture defect detection, we introduce a method based on the idea of texture restoration that surpasses all state-of-the-art methods on the texture classes of the challenging MVTecAD dataset. In the same context, we additionally introduce a method that utilises transformer networks for future pixel and feature prediction. This novel method is able to perform competitive anomaly detection on most of the challenging MVTecAD dataset texture classes and illustrates both the promise and limitations of state-of-the-art deep learning transformers for the task of texture anomaly detection.