423 research outputs found

    Structural Teacher-Student Normality Learning for Multi-Class Anomaly Detection and Localization

    Visual anomaly detection is a challenging open-set task aimed at identifying unknown anomalous patterns while modeling normal data. The knowledge distillation paradigm has shown remarkable performance in one-class anomaly detection by leveraging teacher-student network feature comparisons. However, extending this paradigm to multi-class anomaly detection introduces novel scalability challenges. In this study, we address the significant performance degradation observed in previous teacher-student models when applied to multi-class anomaly detection, which we identify as resulting from cross-class interference. To tackle this issue, we introduce a novel approach known as Structural Teacher-Student Normality Learning (SNL): (1) We propose spatial-channel distillation and intra- and inter-affinity distillation techniques to measure the structural distance between the teacher and student networks. (2) We introduce a central residual aggregation module (CRAM) to encapsulate the normal representation space of the student network. We evaluate our proposed approach on two anomaly detection datasets, MVTecAD and VisA. Our method surpasses the state-of-the-art distillation-based algorithms by a significant margin of 3.9% and 1.5% on MVTecAD and 1.2% and 2.5% on VisA in the multi-class anomaly detection and localization tasks, respectively. Furthermore, our algorithm outperforms the current state-of-the-art unified models on both MVTecAD and VisA.
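
The teacher-student feature comparison mentioned in this abstract can be illustrated with a minimal sketch. It shows only the generic idea of scoring each spatial location by the cosine distance between teacher and student features; it is not the SNL method itself (no spatial-channel distillation, affinity distillation, or CRAM). The function names and the nested-list feature layout are assumptions made for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat an all-zero feature as maximally dissimilar.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def anomaly_map(teacher_feats, student_feats):
    """Score every location of an H x W grid of C-dimensional features.

    Both inputs are nested lists shaped [H][W][C] (hypothetical layout).
    A large score means the student failed to reproduce the teacher there.
    """
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher_feats, student_feats)
    ]


def image_score(score_map):
    """Image-level anomaly score: the maximum per-location score."""
    return max(max(row) for row in score_map)
```

In this sketch, localization reads the per-location map directly, while detection reduces it to one number per image; taking the maximum is one common reduction, chosen here as an assumption.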

    Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data

    Existing one-shot 4D head synthesis methods usually learn from monocular videos with the aid of 3DMM reconstruction, yet the latter is equally challenging, which prevents them from achieving reasonable 4D head synthesis. We present a method to learn one-shot 4D head synthesis via large-scale synthetic data. The key is to first learn a part-wise 4D generative model from monocular images via adversarial learning, to synthesize multi-view images of diverse identities and full motions as training data; then leverage a transformer-based animatable triplane reconstructor to learn 4D head reconstruction using the synthetic data. A novel learning strategy is employed to enhance the generalizability to real images by disentangling the learning process of 3D reconstruction and reenactment. Experiments demonstrate our superiority over the prior art. Project page: https://yudeng.github.io/Portrait4D

    BMAD: Benchmarks for Medical Anomaly Detection

    Anomaly detection (AD) is a fundamental research problem in machine learning and computer vision, with practical applications in industrial inspection, video surveillance, and medical diagnosis. In medical imaging, AD is especially vital for detecting and diagnosing anomalies that may indicate rare diseases or conditions. However, there is no universal and fair benchmark for evaluating AD methods on medical images, which hinders the development of more generalized and robust AD methods in this specific domain. To bridge this gap, we introduce a comprehensive evaluation benchmark for assessing anomaly detection methods on medical images. This benchmark encompasses six reorganized datasets from five medical domains (i.e., brain MRI, liver CT, retinal OCT, chest X-ray, and digital histopathology) and three key evaluation metrics, and includes a total of fourteen state-of-the-art AD algorithms. This standardized and well-curated medical benchmark, together with its well-structured codebase, enables comprehensive comparisons among recently proposed anomaly detection methods. It will help the community conduct fair comparisons and advance the field of AD on medical imaging. More information on BMAD is available in our GitHub repository: https://github.com/DorisBao/BMA

    Sentinel-Guided Zero-Shot Learning: A Collaborative Paradigm without Real Data Exposure

    With increasing concerns over data privacy and model copyrights, especially in the context of collaborations between AI service providers and data owners, an innovative SG-ZSL paradigm is proposed in this work. SG-ZSL is designed to foster efficient collaboration without the need to exchange models or sensitive data. It consists of a teacher model, a student model, and a generator that links the two models. The teacher model serves as a sentinel on behalf of the data owner, replacing real data, to guide the student model at the AI service provider's end during training. Considering the disparity of knowledge space between the teacher and student, we introduce two variants of the teacher model: the omniscient and the quasi-omniscient teachers. Under these teachers' guidance, the student model seeks to match the teacher model's performance and explores domains that the teacher has not covered. To trade off between privacy and performance, we further introduce two distinct security-level training protocols: white-box and black-box, enhancing the paradigm's adaptability. Despite the inherent challenges of real data absence in the SG-ZSL paradigm, it consistently outperforms in ZSL and GZSL tasks, notably in the white-box protocol. Our comprehensive evaluation further attests to its robustness and efficiency across various setups, including the stringent black-box training protocol.

    AdaFuse: Adaptive Medical Image Fusion Based on Spatial-Frequential Cross Attention

    Multi-modal medical image fusion is essential for precise clinical diagnosis and surgical navigation since it can merge the complementary information from multiple modalities into a single image. The quality of the fused image depends on the extracted single-modality features as well as the fusion rules for multi-modal information. Although existing deep learning-based fusion methods can fully exploit the semantic features of each modality, they cannot distinguish the effective low- and high-frequency information of each modality and fuse them adaptively. To address this issue, we propose AdaFuse, in which multimodal image information is fused adaptively through a frequency-guided attention mechanism based on the Fourier transform. Specifically, we propose the cross-attention fusion (CAF) block, which adaptively fuses features of two modalities in the spatial and frequency domains by exchanging key and query values, and then calculates the cross-attention scores between the spatial and frequency features to further guide the spatial-frequential information fusion. The CAF block enhances the high-frequency features of the different modalities so that the details in the fused images can be retained. Moreover, we design a novel loss function composed of structure loss and content loss to preserve both low- and high-frequency information. Extensive comparison experiments on several datasets demonstrate that the proposed method outperforms state-of-the-art methods in terms of both visual quality and quantitative metrics. The ablation experiments also validate the effectiveness of the proposed loss and fusion strategy.

    Multi-objective optimization of Tension Leg Platform using evolutionary algorithm based on surrogate model

    An Innovative Tension Leg Platform (TLP) Optimization Program, called ITOP, has been developed to solve the multi-objective optimization problem for TLP. We first examine the hydrodynamic behavior of a base TLP for wave headings between 0° and 45°. The numerical results show that the maximum heave and surge motion responses occur at a 0° wave heading in long-crested waves. It is found that the dynamic tension of the No. 8 tendon is larger than that of the other tendons and reaches its maximum at a 45° wave heading. This can be attributed to the fact that heave and pitch motions are almost out of phase for wave periods between 10 and 15 s. Because the maximum wave elevation occurs near the northeast column and the vertical motion is very small, the minimum airgap occurs there. Moreover, a surrogate model based on radial basis functions (RBF) has been built and adopted to estimate the hydrodynamic performance of the TLP. A multi-objective evolutionary algorithm, Non-dominated Sorting Genetic Algorithm II (NSGA-II), is employed to find the Pareto-optimal solutions. Comprehensive and systematic computations and analyses reveal that the maximum dynamic tension shows a positive correlation with pontoon height and width, but a negative correlation with hull draft, column spacing, and column diameter. The most efficient modification strategy for design is proposed to reduce the maximum dynamic tendon tension. According to the strategy, the column spacing, draft, and column diameter should be increased in sequence. By applying this strategy, the maximum dynamic tendon tensions can be reduced while the total weight of the platform is minimized as much as possible.
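
The notion of Pareto-optimal solutions used in this abstract can be sketched as a simple non-dominated filter. This is only the dominance test that underlies non-dominated sorting, not NSGA-II itself (no ranking into successive fronts, crowding distance, or genetic operators), and not the ITOP program. The objective pairs below (tendon tension, platform weight) are made-up illustrative numbers, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and
    strictly better in at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (max dynamic tension, total weight) pairs, not real data.
designs = [(3.0, 10.0), (2.0, 12.0), (4.0, 11.0), (2.5, 10.5)]
front = pareto_front(designs)
print(front)
```

Here the design `(4.0, 11.0)` is dropped because `(3.0, 10.0)` is better in both objectives; the remaining three trade tension against weight and so are mutually non-dominated.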

    Abnormal brain spontaneous activity in major depressive disorder adolescents with non-suicidal self injury and its changes after sertraline therapy

    Background: Non-suicidal self-injury (NSSI) commonly occurs among adolescents with major depressive disorder (MDD), causing adverse effects on the physical and mental health of the patients. However, the underlying neurobiological mechanism of NSSI in adolescents with MDD (nsMDDs) remains unclear, and there are still challenges in the treatment. Studies have suggested that sertraline administration could be an effective way for treatment. Methods: To verify the effectiveness and to explore the neurobiological processes, we treated a group of adolescents with nsMDDs with sertraline in this study. The brain spontaneous activity alteration was then investigated in fifteen unmedicated first-episode adolescent nsMDDs versus twenty-two healthy controls through resting-state functional magnetic resonance imaging. Besides the baseline scanning for all participants, the nsMDDs group was scanned again after eight weeks of sertraline therapy to examine the changes after treatment. Results: At pre-treatment, whole brain analysis of mean amplitude of low-frequency fluctuation (mALFF) was performed to examine the neuronal spontaneous activity alteration, and increased mALFF was found in the superior occipital extending to lingual gyrus in adolescent nsMDDs compared with controls. Meanwhile, decreased mALFF was found in the medial superior frontal in adolescent nsMDDs compared with controls. Compared with the pre-treatment, the nsMDDs group was found to have a trend of, respectively, decreased and increased functional neuronal activity at the two brain areas after treatment through the region of interest analysis. Further, whole brain comparison of mALFF at pre-treatment and post-treatment showed significantly decreased spontaneous activity in the orbital middle frontal and lingual gyrus in adolescent nsMDDs after treatment. Also, depression severity was significantly decreased after treatment. Conclusion: The abnormal functional neuronal activity found at frontal and occipital cortex implied cognitive and affective disturbances in adolescent nsMDDs. The trend of upregulation of frontal neuronal activity and downregulation of occipital neuronal activity after sertraline treatment indicated that the therapy could be effective in regulating the abnormality. Notably, the significantly decreased neuronal activity in the decision related orbital middle frontal and anxiety-depression related lingual gyrus could be suggestive of reduced NSSI in adolescent MDD after therapy.