PAEDID: Patch Autoencoder Based Deep Image Decomposition For Pixel-level Defective Region Segmentation
Unsupervised pixel-level defective region segmentation is an important task
in image-based anomaly detection for various industrial applications. The
state-of-the-art methods have their own advantages and limitations:
matrix-decomposition-based methods are robust to noise but lack complex
background image modeling capability; representation-based methods are good at
defective region localization but lack accuracy in defective region shape
contour extraction; the defective regions detected by reconstruction-based
methods match the ground-truth shape contours well but are noisy.
To combine these complementary strengths, we present an unsupervised patch
autoencoder based deep image decomposition (PAEDID) method for defective region
segmentation. In the training stage, we learn the common background as a deep
image prior by a patch autoencoder (PAE) network. In the inference stage, we
formulate anomaly detection as an image decomposition problem with the deep
image prior and domain-specific regularizations. By adopting the proposed
approach, the defective regions in the image can be accurately extracted in an
unsupervised fashion. We demonstrate the effectiveness of the PAEDID method in
simulation studies and in a case study on an industrial dataset.
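The abstract above frames inference as splitting an image into a background part and a defect part under regularization. The following is a minimal sketch of that general idea, not the PAEDID objective itself: it assumes the background estimate is already available (a fixed stand-in for the learned deep image prior) and assumes a plain L1 sparsity penalty on the defect component, for which the per-pixel minimizer is soft-thresholding. The names `soft_threshold`, `sparse_anomaly`, and the value of `lam` are hypothetical.

```python
def soft_threshold(value, lam):
    """Proximal operator of lam * |s|: shrink value toward zero by lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparse_anomaly(image, background, lam):
    """Minimize 0.5 * (x - b - s)**2 + lam * |s| independently for each pixel.

    Assumption: `background` is given; in a real pipeline it would come
    from a learned prior rather than a constant grid.
    """
    return [
        [soft_threshold(x - b, lam) for x, b in zip(img_row, bg_row)]
        for img_row, bg_row in zip(image, background)
    ]


# Toy 2x3 image: flat background of 1.0, slight noise, one defective pixel.
image = [[1.0, 1.05, 1.0], [0.95, 3.0, 1.0]]
background = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # stand-in for the learned prior
anomaly = sparse_anomaly(image, background, lam=0.5)

# Pixels whose residual survives the shrinkage are flagged as defective.
mask = [[s != 0.0 for s in row] for row in anomaly]
```

In this toy case the small noise residuals fall below `lam` and are zeroed, while the large residual at the defective pixel survives, which illustrates why a sparsity penalty can suppress the noise that plain reconstruction residuals carry.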
DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection
Visual anomaly detection, an important problem in computer vision, is usually
formulated as a one-class classification and segmentation task. The
student-teacher (S-T) framework has proved to be effective in solving this
challenge. However, previous works based on S-T only empirically applied
constraints on normal data and fused multi-level information. In this study, we
propose an improved model called DeSTSeg, which integrates a pre-trained
teacher network, a denoising student encoder-decoder, and a segmentation
network into one framework. First, to strengthen the constraints on anomalous
data, we introduce a denoising procedure that allows the student network to
learn more robust representations. From synthetically corrupted normal images,
we train the student network to match the teacher network feature of the same
images without corruption. Second, to fuse the multi-level S-T features
adaptively, we train a segmentation network with rich supervision from
synthetic anomaly masks, achieving a substantial performance improvement.
Experiments on the industrial inspection benchmark dataset demonstrate that our
method achieves state-of-the-art performance: 98.6% on image-level ROC, 75.8%
on pixel-level average precision, and 76.4% on instance-level average
precision.
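In a student-teacher setup like the one described above, anomalies are typically exposed where the student's features disagree with the teacher's. The sketch below shows one plausible way to turn that disagreement into a per-location score; the use of cosine distance is an assumption made for illustration, since the abstract does not name the discrepancy measure, and the function names and toy feature values are hypothetical. It does not include the segmentation network that DeSTSeg trains to fuse multi-level features.

```python
import math


def cosine_distance(u, v, eps=1e-12):
    """Return 1 - cosine similarity of two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + eps)


def anomaly_map(teacher_feats, student_feats):
    """Per-location discrepancy between two H x W grids of feature vectors."""
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher_feats, student_feats)
    ]


# Toy 1x2 grid with 2-dimensional features (hypothetical values).
teacher = [[[1.0, 0.0], [0.0, 1.0]]]
student = [[[1.0, 0.0], [1.0, 0.0]]]  # the second location disagrees
scores = anomaly_map(teacher, student)
```

Locations where the two networks agree score near 0, and locations where their features are orthogonal score near 1.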
VISION Datasets: A Benchmark for Vision-based InduStrial InspectiON
Despite progress in vision-based inspection algorithms, real-world industrial
challenges -- specifically in data availability, quality, and complex
production requirements -- often remain under-addressed. We introduce the
VISION Datasets, a diverse collection of 14 industrial inspection datasets,
uniquely poised to meet these challenges. Unlike previous datasets, VISION
brings versatility to defect detection, offering annotation masks across all
splits and catering to various detection methodologies. Our datasets also
feature instance-segmentation annotation, enabling precise defect
identification. With a total of 18k images encompassing 44 defect types, VISION
strives to mirror a wide range of real-world production scenarios. By
supporting two ongoing challenge competitions on the VISION Datasets, we hope
to foster further advancements in vision-based industrial inspection.
VeCLIP: Improving CLIP Training via Visual-enriched Captions
Large-scale web-crawled datasets are fundamental for the success of
pre-training vision-language models, such as CLIP. However, the inherent noise
and potential irrelevance of web-crawled AltTexts pose challenges in achieving
precise image-text alignment. Existing methods utilizing large language models
(LLMs) for caption rewriting have shown promise on small, curated datasets like
CC3M and CC12M. This study introduces a scalable pipeline for noisy caption
rewriting. Unlike recent LLM rewriting techniques, we emphasize the
incorporation of visual concepts into captions, termed Visual-enriched
Captions (VeCap). To ensure data diversity, we propose a novel mixed training
scheme that optimizes the utilization of AltTexts alongside newly generated
VeCap. We showcase the adaptation of this method for training CLIP on
large-scale web-crawled datasets, termed VeCLIP. Employing this cost-effective
pipeline, we effortlessly scale our dataset up to 300 million samples, named the
VeCap dataset. Our results show significant advantages in image-text alignment
and overall model performance. For example, VeCLIP achieves up to +25.2% gain
in COCO and Flickr30k retrieval tasks under the 12M setting. For data
efficiency, VeCLIP achieves +3% gain while only using 14% of the data employed
in the vanilla CLIP and 11% in ALIGN. We also note that the VeCap data is
complementary to other well-curated datasets suited to zero-shot
classification tasks. When combining VeCap and DFN, our model achieves
strong performance on both image-text retrieval and zero-shot classification
tasks, e.g., 83.1% accuracy@1 on ImageNet zero-shot for an H/14 model. We release
the pre-trained models at https://github.com/apple/ml-veclip.
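The mixed training scheme mentioned above pairs each image with either its original AltText or its generated VeCap caption. A minimal sketch of one way such mixing could work is given below, assuming an independent random choice per sample; the abstract does not specify the sampling rule, so the probability `p_vecap`, the dictionary keys, and the function names are all hypothetical.

```python
import random


def pick_caption(alt_text, vecap, p_vecap, rng):
    """Return the VeCap caption with probability p_vecap, else the AltText."""
    return vecap if rng.random() < p_vecap else alt_text


def mixed_batch(samples, p_vecap=0.5, seed=0):
    """Build (image, caption) pairs, choosing a caption source per sample."""
    rng = random.Random(seed)  # seeded so that a run is reproducible
    return [
        (s["image"], pick_caption(s["alt_text"], s["vecap"], p_vecap, rng))
        for s in samples
    ]


# Hypothetical records; real data would hold image tensors or file paths.
samples = [
    {"image": "img_0", "alt_text": "IMG_0042.jpg", "vecap": "a red bicycle by a wall"},
    {"image": "img_1", "alt_text": "sale now on", "vecap": "two mugs on a wooden table"},
]
only_vecap = mixed_batch(samples, p_vecap=1.0)
only_alt = mixed_batch(samples, p_vecap=0.0)
```

Setting `p_vecap` to 1.0 or 0.0 recovers training on a single caption source, while intermediate values keep both sources in play.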
Ginsenoside F1 attenuates pirarubicin-induced cardiotoxicity by modulating Nrf2 and AKT/Bcl-2 signaling pathways
Background: Pirarubicin (THP) is an anthracycline antibiotic used to treat various malignancies in humans. The clinical usefulness of THP is limited by its dose-related cardiotoxicity. Ginsenoside F1 (GF1) is a metabolite formed when the ginsenosides Re and Rg1 are hydrolyzed. However, the protective effects and underlying mechanisms of GF1 on THP-induced cardiotoxicity remain unclear.
Methods: We investigated the anti-apoptotic and anti-oxidative stress effects of GF1 in an in vitro model, using H9c2 cells stimulated by THP plus the Nrf2 inhibitor trigonelline or the AKT inhibitor imidazoquinoxaline (IMQ), and in an in vivo model of THP-induced cardiotoxicity in rats. The levels of malondialdehyde (MDA), brain natriuretic peptide (BNP), creatine kinase (CK-MB), cardiac troponin (c-TnT), lactate dehydrogenase (LDH), superoxide dismutase (SOD) and glutathione (GSH) were determined using an enzyme-linked immunosorbent assay. Nuclear factor (erythroid-derived 2)-like 2 (Nrf2), the expression of Nrf2 target genes, including heme oxygenase-1 (HO-1), glutathione-S-transferase (Gst), and glutamate-cysteine ligase modifier subunit (GCLM), and the expression levels of AKT/Bcl-2 signaling pathway proteins were detected using Western blot analysis.
Results: GF1 reduced THP-induced myocardial histopathological damage, electrocardiogram (ECG) abnormalities, and cardiac dysfunction in vivo. GF1 also decreased MDA, BNP, CK-MB, c-TnT, and LDH levels in the serum, while raising SOD and GSH levels. GF1 boosted Nrf2 nuclear translocation and Nrf2 target gene expression, including HO-1, Gst, and GCLM. Furthermore, GF1 regulated apoptosis by activating AKT/Bcl-2 signaling pathways. When the Nrf2 inhibitor trigonelline or the AKT inhibitor IMQ was applied, GF1 no longer showed antioxidant and anti-apoptotic effects.
Conclusion: GF1 was found to alleviate THP-induced cardiotoxicity by modulating the Nrf2 and AKT/Bcl-2 signaling pathways, ultimately alleviating myocardial oxidative stress and apoptosis.