
    A Zero-/Few-Shot Anomaly Classification and Segmentation Method for CVPR 2023 VAND Workshop Challenge Tracks 1&2: 1st Place on Zero-shot AD and 4th Place on Few-shot AD

    In this technical report, we briefly introduce our solution for the Zero-/Few-shot Track of the Visual Anomaly and Novelty Detection (VAND) 2023 Challenge. For industrial visual inspection, building a single model that can be rapidly adapted to numerous categories with few or no normal reference images is a promising research direction, primarily because of the vast variety of product types. For the zero-shot track, we propose a solution based on the CLIP model that adds extra linear layers. These layers map the image features into the joint embedding space so that they can be compared with the text features to generate anomaly maps. Moreover, when reference images are available, we utilize multiple memory banks to store their features and compare them with the features of test images at test time. In this challenge, our method achieved first place in the zero-shot track, excelling in particular at segmentation with an F1 score 0.0489 higher than that of the second-ranked participant. In the few-shot track, we secured fourth place overall, and our classification F1 score of 0.8687 ranked first among all participating teams.
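
    The projection-and-compare step described above lends itself to a short sketch. The snippet below is a minimal, hypothetical illustration assuming pre-extracted CLIP patch features and two prompt embeddings ("normal"/"anomalous"); PatchProjector and all shapes are illustrative assumptions, not the authors' released code.

        import torch
        import torch.nn.functional as F

        class PatchProjector(torch.nn.Module):
            """Extra linear layer mapping CLIP patch features into the joint space."""
            def __init__(self, feat_dim: int, embed_dim: int):
                super().__init__()
                self.proj = torch.nn.Linear(feat_dim, embed_dim)

            def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
                # patch_feats: (B, N, feat_dim) -> (B, N, embed_dim), L2-normalized
                return F.normalize(self.proj(patch_feats), dim=-1)

        def anomaly_map(patch_feats, text_feats, projector, hw):
            # text_feats: (2, embed_dim), stacked [normal, anomalous] prompt embeddings
            z = projector(patch_feats)                          # (B, N, D)
            probs = (z @ text_feats.t()).softmax(dim=-1)[..., 1]  # P(anomalous) per patch
            h, w = hw
            return probs.reshape(-1, 1, h, w)  # coarse map; upsample to image size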

    Transmission of H7N9 influenza virus in mice by different infective routes.

    Background: On 19 February 2013, the first patient infected with a novel influenza A H7N9 virus of avian origin showed symptoms of illness. More than 349 laboratory-confirmed cases and 109 deaths have been reported in mainland China since then. Laboratory-confirmed, human-to-human H7N9 virus transmission has not been documented between individuals in close contact; however, this transmission route could not be excluded for three families. To control the spread of the avian influenza H7N9 virus, we must better understand its pathogenesis, transmissibility, and transmission routes in mammals. Studies have shown that this particular virus is transmitted by aerosols among ferrets.
    Methods: To study potential transmission routes in animals with direct or close contact to other animals, we investigated these factors in a murine model.
    Results: Viable H7N9 avian influenza virus was detected in the upper and lower respiratory tracts, intestine, and brain of model mice. The virus was transmissible between mice in close contact, with higher concentrations of virus found in pharyngeal and ocular secretions and feces. All of these biological materials were contagious for naïve mice.
    Conclusions: Our results suggest that the possible transmission routes for the H7N9 influenza virus were through mucosal secretions and feces.

    Learning Global-aware Kernel for Image Harmonization

    Image harmonization aims to resolve the visual inconsistency in composited images by adaptively adjusting foreground pixels using the background as a reference. Existing methods employ local color transformation or region matching between foreground and background, which neglects the powerful proximity prior and treats the foreground and background independently as whole regions for harmonization. As a result, they still show limited performance across varied foreground objects and scenes. To address this issue, we propose a novel Global-aware Kernel Network (GKNet) to harmonize local regions with comprehensive consideration of long-distance background references. Specifically, GKNet comprises two parts, i.e., harmony kernel prediction and harmony kernel modulation branches. The former includes a Long-distance Reference Extractor (LRE) to obtain long-distance context and Kernel Prediction Blocks (KPB) to predict multi-level harmony kernels by fusing global information with local features. To achieve this goal, a novel Selective Correlation Fusion (SCF) module is proposed to better select relevant long-distance background references for local harmonization. The latter employs the predicted kernels to harmonize foreground regions with both local and global awareness. Extensive experiments demonstrate the superiority of our method for image harmonization over state-of-the-art methods, e.g., achieving 39.53 dB PSNR, surpassing the best counterpart by +0.78 dB, and decreasing fMSE/MSE by 11.5%/6.7% compared with the SoTA method. Code will be available at https://github.com/XintianShen/GKNet.
    Comment: 10 pages, 10 figures
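
    The kernel prediction/modulation split can be approximated with a toy sketch. In the snippet below, a single convolution stands in for the LRE/KPB/SCF machinery that fuses long-distance background context, and the per-pixel filtering follows the generic kernel-prediction pattern rather than GKNet's actual design, so treat it as an assumption-laden illustration.

        import torch
        import torch.nn.functional as F

        class KernelModulation(torch.nn.Module):
            """Predicts a k*k filter at every pixel, then filters the image with it."""
            def __init__(self, feat_ch: int, k: int = 3):
                super().__init__()
                self.k = k
                # Stand-in for harmony kernel prediction (the real model fuses
                # global background context into this branch).
                self.pred = torch.nn.Conv2d(feat_ch, k * k, kernel_size=3, padding=1)

            def forward(self, img: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
                b, c, h, w = img.shape
                kernels = self.pred(feats).softmax(dim=1)             # (B, k*k, H, W)
                patches = F.unfold(img, self.k, padding=self.k // 2)  # (B, C*k*k, H*W)
                patches = patches.view(b, c, self.k * self.k, h * w)
                kernels = kernels.view(b, 1, self.k * self.k, h * w)
                out = (patches * kernels).sum(dim=2)  # weighted sum over each window
                return out.view(b, c, h, w)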

    CLIP-AD: A Language-Guided Staged Dual-Path Model for Zero-shot Anomaly Detection

    This paper considers zero-shot Anomaly Detection (AD), i.e., performing AD without reference images of the test objects. We propose a framework called CLIP-AD that leverages the zero-shot capabilities of the large vision-language model CLIP. First, we reinterpret text prompt design from a distributional perspective and propose a Representative Vector Selection (RVS) paradigm to obtain improved text features. Second, we observe opposite predictions and irrelevant highlights when the anomaly maps are computed directly. To address these issues, we introduce a Staged Dual-Path model (SDP) that leverages features from various levels and applies architecture and feature surgery. Finally, delving deeper into the two phenomena, we point out that the image and text features are not aligned in the joint embedding space. We therefore introduce a fine-tuning strategy that adds linear layers and construct an extended model, SDP+, further enhancing performance. Extensive experiments demonstrate the effectiveness of our approach: e.g., on MVTec-AD, SDP outperforms the SOTA WinCLIP by +4.2/+10.7 on the segmentation metrics F1-max/PRO, while SDP+ achieves +8.3/+20.5 improvements.
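
    A hedged sketch of the scoring idea: build one representative text vector per state from a prompt ensemble, then score patch features against both. The plain mean below is a simple stand-in for Representative Vector Selection (RVS proper works distributionally), and an open_clip-style encode_text/tokenizer interface is assumed.

        import torch
        import torch.nn.functional as F

        @torch.no_grad()
        def representative_vector(clip_model, tokenizer, prompts):
            # Mean of normalized prompt embeddings; a placeholder for RVS,
            # which selects representatives from the prompt distribution.
            tokens = tokenizer(prompts)
            feats = F.normalize(clip_model.encode_text(tokens), dim=-1)
            return F.normalize(feats.mean(dim=0), dim=-1)

        def patch_scores(patch_feats, t_normal, t_anomalous):
            # patch_feats: (N, D), already aligned to the joint space and normalized
            logits = torch.stack(
                [patch_feats @ t_normal, patch_feats @ t_anomalous], dim=-1)
            return logits.softmax(dim=-1)[:, 1]  # anomaly probability per patch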

    T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step

    Large language models (LLMs) have achieved remarkable performance on various NLP tasks and are augmented by tools for broader applications. Yet how to evaluate and analyze the tool-utilization capability of LLMs is still under-explored. In contrast to previous works that evaluate models holistically, we comprehensively decompose tool utilization into multiple sub-processes: instruction following, planning, reasoning, retrieval, understanding, and review. Based on that, we introduce T-Eval to evaluate tool-utilization capability step by step. T-Eval disentangles the tool-utilization evaluation into several sub-domains along model capabilities, facilitating an inner understanding of both the holistic and isolated competencies of LLMs. We conduct extensive experiments on T-Eval and an in-depth analysis of various LLMs. T-Eval not only exhibits consistency with outcome-oriented evaluation but also provides a more fine-grained analysis of LLM capabilities, offering a new perspective on evaluating tool-utilization ability. The benchmark will be available at https://github.com/open-compass/T-Eval.
    Comment: Project page: https://open-compass.github.io/T-Eval
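
    Illustratively, step-by-step evaluation of this kind can be skeletonized as below. The dataset fields, sub-process names, and scorer interface are assumptions made for the sketch, not T-Eval's actual schema.

        from statistics import mean

        SUBPROCESSES = ["instruct", "plan", "reason", "retrieve", "understand", "review"]

        def evaluate(model, dataset, scorers):
            # dataset: dicts with "subprocess", "prompt", and "gold" keys (assumed)
            per_skill = {s: [] for s in SUBPROCESSES}
            for sample in dataset:
                skill = sample["subprocess"]          # capability this sample probes
                prediction = model(sample["prompt"])  # model under test
                per_skill[skill].append(scorers[skill](prediction, sample["gold"]))
            # Report isolated per-capability scores plus a holistic average.
            report = {s: mean(v) for s, v in per_skill.items() if v}
            report["overall"] = mean(x for v in per_skill.values() for x in v)
            return report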