423 research outputs found
Structural Teacher-Student Normality Learning for Multi-Class Anomaly Detection and Localization
Visual anomaly detection is a challenging open-set task aimed at identifying
unknown anomalous patterns while modeling normal data. The knowledge
distillation paradigm has shown remarkable performance in one-class anomaly
detection by leveraging teacher-student network feature comparisons. However,
extending this paradigm to multi-class anomaly detection introduces novel
scalability challenges. In this study, we address the significant performance
degradation observed in previous teacher-student models when applied to
multi-class anomaly detection, which we identify as resulting from cross-class
interference. To tackle this issue, we introduce a novel approach known as
Structural Teacher-Student Normality Learning (SNL): (1) We propose
spatial-channel distillation and intra-&inter-affinity distillation techniques
to measure structural distance between the teacher and student networks. (2) We
introduce a central residual aggregation module (CRAM) to encapsulate the
normal representation space of the student network. We evaluate our proposed
approach on two anomaly detection datasets, MVTecAD and VisA. Our method
surpasses the state-of-the-art distillation-based algorithms by a significant
margin of 3.9% and 1.5% on MVTecAD and 1.2% and 2.5% on VisA in the multi-class
anomaly detection and localization tasks, respectively. Furthermore, our
algorithm outperforms the current state-of-the-art unified models on both
MVTecAD and VisA.
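The teacher-student feature comparison that this abstract builds on can be illustrated with a minimal sketch. It scores each spatial location by the cosine distance between teacher and student feature vectors, which is a common choice in distillation-based anomaly detection. It is not the SNL method itself: the spatial-channel and affinity distillation terms and the CRAM module are not reproduced, and the function names and toy feature values are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine_distance(t: Sequence[float], s: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(t, s))
    norm_t = math.sqrt(sum(a * a for a in t))
    norm_s = math.sqrt(sum(b * b for b in s))
    if norm_t == 0.0 or norm_s == 0.0:
        return 1.0
    return 1.0 - dot / (norm_t * norm_s)


def anomaly_map(
    teacher: List[List[Sequence[float]]],
    student: List[List[Sequence[float]]],
) -> List[List[float]]:
    """Per-location anomaly score for two H x W grids of feature vectors."""
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher, student)
    ]


# Toy 1 x 2 feature grids (hypothetical values): the student matches the
# teacher at the first location and disagrees at the second.
teacher_feats = [[[1.0, 0.0], [0.0, 1.0]]]
student_feats = [[[1.0, 0.0], [1.0, 0.0]]]
scores = anomaly_map(teacher_feats, student_feats)
```

A large score marks a location where the student fails to reproduce the teacher's features, which is what such methods treat as evidence of an anomaly.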
Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data
Existing one-shot 4D head synthesis methods usually learn from monocular
videos with the aid of 3DMM reconstruction, yet the latter is equally
challenging, which restricts them from reasonable 4D head synthesis. We present
a method to learn one-shot 4D head synthesis via large-scale synthetic data.
The key is to first learn a part-wise 4D generative model from monocular images
via adversarial learning, to synthesize multi-view images of diverse identities
and full motions as training data; then leverage a transformer-based animatable
triplane reconstructor to learn 4D head reconstruction using the synthetic
data. A novel learning strategy is enforced to enhance the generalizability to
real images by disentangling the learning process of 3D reconstruction and
reenactment. Experiments demonstrate our superiority over the prior art. Comment: Project page: https://yudeng.github.io/Portrait4D
BMAD: Benchmarks for Medical Anomaly Detection
Anomaly detection (AD) is a fundamental research problem in machine learning
and computer vision, with practical applications in industrial inspection,
video surveillance, and medical diagnosis. In medical imaging, AD is especially
vital for detecting and diagnosing anomalies that may indicate rare diseases or
conditions. However, there is a lack of a universal and fair benchmark for
evaluating AD methods on medical images, which hinders the development of more
generalized and robust AD methods in this specific domain. To bridge this gap,
we introduce a comprehensive evaluation benchmark for assessing anomaly
detection methods on medical images. This benchmark encompasses six reorganized
datasets from five medical domains (i.e. brain MRI, liver CT, retinal OCT,
chest X-ray, and digital histopathology) and three key evaluation metrics, and
includes a total of fourteen state-of-the-art AD algorithms. This standardized
and well-curated medical benchmark with a well-structured codebase enables
comprehensive comparisons among recently proposed anomaly detection methods. It
will help the community conduct fair comparisons and advance the
field of AD on medical imaging. More information on BMAD is available in our
GitHub repository: https://github.com/DorisBao/BMA
Sentinel-Guided Zero-Shot Learning: A Collaborative Paradigm without Real Data Exposure
With increasing concerns over data privacy and model copyrights, especially
in the context of collaborations between AI service providers and data owners,
an innovative SG-ZSL paradigm is proposed in this work. SG-ZSL is designed to
foster efficient collaboration without the need to exchange models or sensitive
data. It consists of a teacher model, a student model and a generator that
links both model entities. The teacher model serves as a sentinel on behalf of
the data owner, replacing real data, to guide the student model at the AI
service provider's end during training. Considering the disparity of knowledge
space between the teacher and student, we introduce two variants of the teacher
model: the omniscient and the quasi-omniscient teachers. Under these teachers'
guidance, the student model seeks to match the teacher model's performance and
explores domains that the teacher has not covered. To trade off between privacy
and performance, we further introduce two distinct security-level training
protocols: white-box and black-box, enhancing the paradigm's adaptability.
Despite the inherent challenges of real data absence in the SG-ZSL paradigm, it
consistently outperforms in ZSL and GZSL tasks, notably in the white-box
protocol. Our comprehensive evaluation further attests to its robustness and
efficiency across various setups, including the stringent black-box training
protocol.
AdaFuse: Adaptive Medical Image Fusion Based on Spatial-Frequential Cross Attention
Multi-modal medical image fusion is essential for precise clinical
diagnosis and surgical navigation since it can merge the complementary
information from multiple modalities into a single image. The quality of the fused
image depends on the extracted single modality features as well as the fusion
rules for multi-modal information. Although existing deep learning-based fusion
methods can fully exploit the semantic features of each modality, they cannot
distinguish the effective low and high frequency information of each modality
and fuse them adaptively. To address this issue, we propose AdaFuse, in which
multimodal image information is fused adaptively through a frequency-guided
attention mechanism based on the Fourier transform. Specifically, we propose the
cross-attention fusion (CAF) block, which adaptively fuses features of two
modalities in the spatial and frequency domains by exchanging key and query
values, and then calculates the cross-attention scores between the spatial and
frequency features to further guide the spatial-frequential information fusion.
The CAF block enhances the high-frequency features of the different modalities
so that the details in the fused images can be retained. Moreover, we design a
novel loss function composed of structure loss and content loss to preserve
both low and high frequency information. Extensive comparison experiments on
several datasets demonstrate that the proposed method outperforms
state-of-the-art methods in terms of both visual quality and quantitative
metrics. The ablation experiments also validate the effectiveness of the
proposed loss and fusion strategy.
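The distinction between low- and high-frequency information drawn in this abstract can be made concrete with a small one-dimensional sketch. It uses a naive discrete Fourier transform to split a signal into a low-frequency part and a high-frequency part that sum back to the original. This is only an analogy: AdaFuse operates on two-dimensional image features with learned cross-attention, none of which is reproduced here, and the function names and the cutoff parameter are assumptions made for illustration.

```python
import cmath
from typing import List, Tuple


def dft(x: List[float]) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum: List[complex]) -> List[float]:
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def split_frequencies(x: List[float], cutoff: int) -> Tuple[List[float], List[float]]:
    """Split x into low- and high-frequency components.

    Bins k and n - k carry the same absolute frequency, so a bin counts
    as "low" when min(k, n - k) <= cutoff.
    """
    n = len(x)
    spectrum = dft(x)
    low_bins = [spectrum[k] if min(k, n - k) <= cutoff else 0j for k in range(n)]
    high_bins = [spectrum[k] - low_bins[k] for k in range(n)]
    return idft(low_bins), idft(high_bins)


# A constant offset (low frequency) plus an alternating pattern (high frequency).
signal = [1.0 + (-1.0) ** t for t in range(8)]
low_part, high_part = split_frequencies(signal, cutoff=1)
```

In image terms, the low-frequency part corresponds roughly to smooth structure and the high-frequency part to edges and fine detail, which is the information the abstract says the CAF block aims to retain.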
Multi-objective optimization of Tension Leg Platform using evolutionary algorithm based on surrogate model
An Innovative Tension Leg Platform (TLP) Optimization Program, called ITOP, has been developed to solve the multi-objective optimization problem for TLPs. We first examine the hydrodynamic behavior of a base TLP for wave headings between 0° and 45°. The numerical results show that the maximum heave and surge motion responses occur at a 0° wave heading in long-crested waves. The dynamic tension of the No. 8 tendon is larger than that of the other tendons and reaches its maximum at a 45° wave heading. This can be attributed to the fact that heave and pitch motions are almost out of phase for wave periods between 10 and 15 s. Because the maximum wave elevation occurs near the northeast column and the vertical motion is very small, the minimum airgap occurs there. Moreover, a surrogate model based on radial basis functions (RBF) has been built and adopted to estimate the hydrodynamic performance of the TLP. A multi-objective evolutionary algorithm, the Non-dominated Sorting Genetic Algorithm II (NSGA-II), is employed to find the Pareto-optimal solutions. Comprehensive and systematic computations and analyses reveal that the maximum dynamic tension is positively correlated with pontoon height and width, but negatively correlated with hull draft, column spacing, and column diameter. The most efficient modification strategy for design is proposed to reduce the maximum dynamic tendon tension. According to the strategy, the column spacing, draft, and column diameter should be increased in sequence. By applying this strategy, the maximum dynamic tendon tensions can be reduced while the total weight of the platform is minimized as much as possible.
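The notion of Pareto-optimal solutions used in this abstract can be illustrated with a minimal dominance filter for two objectives that are both minimized. This is only the non-dominated test at the core of NSGA-II, not the full algorithm (no ranking into successive fronts, crowding distance, or genetic operators) and not the ITOP program. The function names and the (tension, weight) values are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Indices of the points that no other point dominates (minimization)."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical (maximum dynamic tendon tension, total weight) pairs,
# in arbitrary units and for illustration only.
designs = [(5.0, 9.0), (4.0, 10.0), (6.0, 8.0), (6.0, 10.0), (4.5, 9.5)]
front = pareto_front(designs)  # -> [0, 1, 2, 4]; design 3 is dominated by design 0
```

Each design on the front represents a different trade-off: lowering the tension further is only possible by accepting a heavier platform, and vice versa.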
Abnormal brain spontaneous activity in major depressive disorder adolescents with non-suicidal self injury and its changes after sertraline therapy
Background: Non-suicidal self-injury (NSSI) commonly occurs among adolescents with major depressive disorder (MDD), causing adverse effects on the physical and mental health of the patients. However, the underlying neurobiological mechanism of NSSI in adolescents with MDD (nsMDDs) remains unclear, and there are still challenges in the treatment. Studies have suggested that sertraline administration could be an effective treatment. Methods: To verify the effectiveness and to explore the neurobiological processes, we treated a group of adolescent nsMDDs with sertraline in this study. The brain spontaneous activity alteration was then investigated in fifteen unmedicated first-episode adolescent nsMDDs versus twenty-two healthy controls using resting-state functional magnetic resonance imaging. In addition to the baseline scan for all participants, the nsMDDs group was scanned again after eight weeks of sertraline therapy to examine the changes after treatment. Results: At pre-treatment, whole-brain analysis of the mean amplitude of low-frequency fluctuation (mALFF) was performed to examine the neuronal spontaneous activity alteration, and increased mALFF was found in the superior occipital extending to lingual gyrus in adolescent nsMDDs compared with controls. Meanwhile, decreased mALFF was found in the medial superior frontal in adolescent nsMDDs compared with controls. Compared with pre-treatment, the nsMDDs group showed a trend toward decreased and increased functional neuronal activity, respectively, in these two brain areas after treatment in the region-of-interest analysis. Further, whole-brain comparison of mALFF at pre-treatment and post-treatment showed significantly decreased spontaneous activity in the orbital middle frontal and lingual gyrus in adolescent nsMDDs after treatment. Also, depression severity was significantly decreased after treatment. Conclusion: The abnormal functional neuronal activity found in the frontal and occipital cortex implied cognitive and affective disturbances in adolescent nsMDDs. The trend of upregulation of frontal neuronal activity and downregulation of occipital neuronal activity after sertraline treatment indicated that the therapy could be effective in regulating the abnormality. Notably, the significantly decreased neuronal activity in the decision-related orbital middle frontal and the anxiety-depression-related lingual gyrus could be suggestive of reduced NSSI in adolescent MDD after therapy.