13 research outputs found

    PAEDID: Patch Autoencoder Based Deep Image Decomposition For Pixel-level Defective Region Segmentation

    Unsupervised pixel-level defective region segmentation is an important task in image-based anomaly detection for various industrial applications. The state-of-the-art methods each have their own advantages and limitations: matrix-decomposition-based methods are robust to noise but cannot model complex background images; representation-based methods localize defective regions well but extract their shape contours inaccurately; reconstruction-based methods detect defective regions that match the ground-truth shape contour well but are noisy. To combine the strengths of these approaches, we present an unsupervised patch autoencoder based deep image decomposition (PAEDID) method for defective region segmentation. In the training stage, we learn the common background as a deep image prior using a patch autoencoder (PAE) network. In the inference stage, we formulate anomaly detection as an image decomposition problem with the deep image prior and domain-specific regularizations. With the proposed approach, the defective regions in the image can be accurately extracted in an unsupervised fashion. We demonstrate the effectiveness of the PAEDID method in simulation studies and in a case study on an industrial dataset.

    DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection

    Visual anomaly detection, an important problem in computer vision, is usually formulated as a one-class classification and segmentation task. The student-teacher (S-T) framework has proved effective in solving this challenge. However, previous works based on S-T only empirically applied constraints on normal data and fused multi-level information. In this study, we propose an improved model called DeSTSeg, which integrates a pre-trained teacher network, a denoising student encoder-decoder, and a segmentation network into one framework. First, to strengthen the constraints on anomalous data, we introduce a denoising procedure that allows the student network to learn more robust representations. Given synthetically corrupted normal images, we train the student network to match the teacher network's features for the same images without corruption. Second, to fuse the multi-level S-T features adaptively, we train a segmentation network with rich supervision from synthetic anomaly masks, achieving a substantial performance improvement. Experiments on the industrial inspection benchmark dataset demonstrate that our method achieves state-of-the-art performance: 98.6% on image-level ROC, 75.8% on pixel-level average precision, and 76.4% on instance-level average precision.

    VISION Datasets: A Benchmark for Vision-based InduStrial InspectiON

    Despite progress in vision-based inspection algorithms, real-world industrial challenges, specifically in data availability, quality, and complex production requirements, often remain under-addressed. We introduce the VISION Datasets, a diverse collection of 14 industrial inspection datasets, uniquely poised to meet these challenges. Unlike previous datasets, VISION brings versatility to defect detection, offering annotation masks across all splits and catering to various detection methodologies. Our datasets also feature instance-segmentation annotations, enabling precise defect identification. With a total of 18k images encompassing 44 defect types, VISION strives to mirror a wide range of real-world production scenarios. By supporting two ongoing challenge competitions on the VISION Datasets, we hope to foster further advancements in vision-based industrial inspection.

    VeCLIP: Improving CLIP Training via Visual-enriched Captions

    Large-scale web-crawled datasets are fundamental for the success of pre-training vision-language models, such as CLIP. However, the inherent noise and potential irrelevance of web-crawled AltTexts pose challenges in achieving precise image-text alignment. Existing methods utilizing large language models (LLMs) for caption rewriting have shown promise on small, curated datasets like CC3M and CC12M. This study introduces a scalable pipeline for noisy caption rewriting. Unlike recent LLM rewriting techniques, we emphasize the incorporation of visual concepts into captions, termed Visual-enriched Captions (VeCap). To ensure data diversity, we propose a novel mixed training scheme that optimizes the utilization of AltTexts alongside newly generated VeCap. We showcase the adaptation of this method for training CLIP on large-scale web-crawled datasets, termed VeCLIP. Employing this cost-effective pipeline, we effortlessly scale our dataset up to 300 million samples, named the VeCap dataset. Our results show significant advantages in image-text alignment and overall model performance. For example, VeCLIP achieves up to +25.2% gain in COCO and Flickr30k retrieval tasks under the 12M setting. For data efficiency, VeCLIP achieves +3% gain while using only 14% of the data employed in the vanilla CLIP and 11% in ALIGN. We also note that the VeCap data is complementary to other well-curated datasets suited to zero-shot classification tasks. When combining VeCap and DFN, our model can achieve strong performance on both image-text retrieval and zero-shot classification tasks, e.g. 83.1% accuracy@1 on ImageNet zero-shot for an H/14 model. We release the pre-trained models at https://github.com/apple/ml-veclip.

    Ginsenoside F1 attenuates pirarubicin-induced cardiotoxicity by modulating Nrf2 and AKT/Bcl-2 signaling pathways

    Background: Pirarubicin (THP) is an anthracycline antibiotic used to treat various malignancies in humans. The clinical usefulness of THP is unfortunately limited by its dose-related cardiotoxicity. Ginsenoside F1 (GF1) is a metabolite formed when the ginsenosides Re and Rg1 are hydrolyzed. However, the protective effects and underlying mechanisms of GF1 on THP-induced cardiotoxicity remain unclear. Methods: We investigated the anti-apoptotic and anti-oxidative stress effects of GF1 in an in vitro model, using H9c2 cells stimulated by THP, plus trigonelline or the AKT inhibitor imidazoquinoxaline (IMQ), as well as in an in vivo model of THP-induced cardiotoxicity in rats. Using an enzyme-linked immunosorbent assay, the levels of malondialdehyde (MDA), brain natriuretic peptide (BNP), creatine kinase (CK-MB), cardiac troponin (c-TnT), lactate dehydrogenase (LDH), superoxide dismutase (SOD) and glutathione (GSH) were determined. Nuclear factor (erythroid-derived 2)-like 2 (Nrf2), the expression of Nrf2 target genes, including heme oxygenase-1 (HO-1), glutathione-S-transferase (Gst), and glutamate-cysteine ligase modifier subunit (GCLM), and the expression levels of AKT/Bcl-2 signaling pathway proteins were detected using Western blot analysis. Results: THP-induced myocardial histopathological damage, electrocardiogram (ECG) abnormalities, and cardiac dysfunction were reduced in vivo by GF1. GF1 also decreased MDA, BNP, CK-MB, c-TnT, and LDH levels in the serum, while raising SOD and GSH levels. GF1 boosted Nrf2 nuclear translocation and Nrf2 target gene expression, including HO-1, Gst, and GCLM. Furthermore, GF1 regulated apoptosis by activating AKT/Bcl-2 signaling pathways. With the Nrf2 inhibitor trigonelline and the AKT inhibitor IMQ present, GF1 lacked antioxidant and anti-apoptotic effects. Conclusion: GF1 was found to alleviate THP-induced cardiotoxicity by modulating Nrf2 and AKT/Bcl-2 signaling pathways, ultimately reducing myocardial oxidative stress and apoptosis.