30 research outputs found

    Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation

    In this work, we focus on open vocabulary instance segmentation to expand a segmentation model to classify and segment instance-level novel categories. Previous approaches have relied on massive caption datasets and complex pipelines to establish one-to-one mappings between image regions and words in captions. However, such methods build noisy supervision by matching non-visible words, such as adjectives and verbs, to image regions. Meanwhile, context words are also important for inferring the existence of novel objects, as they show high inter-correlations with novel categories. To overcome these limitations, we devise a joint Caption Grounding and Generation (CGG) framework, which incorporates a novel grounding loss that focuses only on matching object nouns to improve learning efficiency. We also introduce a caption generation head that enables additional supervision and contextual modeling as a complement to the grounding loss. Our analysis and results demonstrate that the grounding and generation components complement each other, significantly enhancing segmentation performance for novel classes. Experiments on the COCO dataset in two settings, Open Vocabulary Instance Segmentation (OVIS) and Open Set Panoptic Segmentation (OSPS), demonstrate the superiority of CGG. Specifically, CGG achieves a substantial improvement of 6.8% mAP for novel classes without extra data on the OVIS task and a 15% PQ improvement for novel classes on the OSPS benchmark. Comment: ICCV-202

    Transformer-Based Visual Segmentation: A Survey

    Visual segmentation seeks to partition images, video frames, or point clouds into multiple segments or groups. This technique has numerous real-world applications, such as autonomous driving, image editing, robot sensing, and medical analysis. Over the past decade, deep learning-based methods have made remarkable strides in this area. Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks. Specifically, vision transformers offer robust, unified, and even simpler solutions for various segmentation tasks. This survey provides a thorough overview of transformer-based visual segmentation, summarizing recent advancements. We first review the background, encompassing problem definitions, datasets, and prior convolutional methods. Next, we summarize a meta-architecture that unifies all recent transformer-based approaches. Based on this meta-architecture, we examine various method designs, including modifications to the meta-architecture and associated applications. We also present several closely related settings, including 3D point cloud segmentation, foundation model tuning, domain-aware segmentation, efficient segmentation, and medical segmentation. Additionally, we compile and re-evaluate the reviewed methods on several well-established datasets. Finally, we identify open challenges in this field and propose directions for future research. The project page can be found at https://github.com/lxtGH/Awesome-Segmenation-With-Transformer. We will also continually monitor developments in this rapidly evolving field. Comment: Work in progress. GitHub: https://github.com/lxtGH/Awesome-Segmenation-With-Transforme

    Maternal Diet Intervention Before Pregnancy Primes Offspring Lipid Metabolism in Liver

    Nonalcoholic fatty liver disease (NAFLD) has a developmental origin and is influenced in utero. We aimed to evaluate whether maternal diet intervention before pregnancy would reduce the risk of offspring NAFLD. In our study, female mice were fed a normal-fat diet (NF group); fed a high-fat diet for 12 weeks and continued on this diet throughout pregnancy and lactation (HF group); or switched from the HF to the NF diet 1 week (H1N group) or 9 weeks (H9N group) before pregnancy. Compared with the NF offspring, the H1N and HF offspring, but not the H9N offspring, displayed more severe hepatic steatosis and glucose intolerance. More specifically, an abnormal blood lipid panel was seen in the H1N offspring, and abnormal hepatic free fatty acid composition was present in both the HF and H1N offspring, while the H9N offspring displayed both at normal levels. These physiological changes were associated with desensitized hepatic insulin/AKT signaling, increased expression of genes and proteins for de novo lipogenesis and cholesterol synthesis, decreased expression of genes and proteins for fatty acid oxidation, increased Pcsk9 expression, and hypoactivation of 5' AMP-activated protein kinase (AMPK) signaling in the HF and H1N offspring. However, these effects were completely or partially rescued in the H9N offspring. In summary, we found that early maternal diet intervention is effective in reducing the risk of offspring NAFLD caused by a maternal HF diet. These findings provide significant support for developing effective diet intervention strategies and policies for the prevention of obesity and NAFLD to promote optimal health outcomes for mothers and children.

    Durability Performance of Basalt Fiber-Reinforced Concrete Subjected to Sulfate–Magnesium Combined Attack

    In salt lake areas, cast-in-situ concrete structures are subjected to long-term corrosion by sulfate and magnesium ions. The properties of concrete can be improved by adding materials such as basalt fiber (BF). To investigate the degradation process and mechanism of cast-in-situ concrete with premixed BF under the dual corrosion of sulfate and magnesium salts, concrete with a BF content ranging from 0 to 0.5% was prepared. Specimens were subjected to different internal and external corrosion conditions and immersed for 180 days. Changes in dimension, mass, and appearance at different immersion times were recorded. The compressive and flexural strengths of the specimens were tested and monitored throughout the immersion period. Mineral and microstructural changes at different immersion times were determined by XRD, TG, and SEM analysis. Results indicated that the combined external sulfate–internal magnesium attack had a significant negative effect on early strength. The compressive and flexural strengths of the corroded specimens decreased by 17.2% and 14.1%, respectively, compared to the control group at 28 days. The premixed magnesium ions caused the decomposition of the C-S-H gel, resulting in severe spalling and lower mechanical properties after long-term immersion. Because BF can inhibit crack development, the properties of the concrete premixed with BF were improved. Specimens exhibited superior performance at a BF content of 0.5%, resulting in a 16.2% increase in flexural strength. This paper serves as a valuable reference for the application of basalt fiber-reinforced concrete under the challenging conditions of sulfate–magnesium combined attack.

    Structural and Electrochemical Characterization of (NH 4

    Circulating Soluble Neuropilin-1 in Patients with Early Cervical Cancer and Cervical Intraepithelial Neoplasia Can Be Used as a Valuable Diagnostic Biomarker

    Objective. To investigate soluble neuropilin-1 (sNRP-1) in circulation and NRP-1 protein in cervical tissues from patients with cervical cancer or cervical intraepithelial neoplasia (CIN). Methods. sNRP-1 was measured in 64 preoperative patients and 20 controls. NRP-1 protein in cervical tissue was detected in 56 patients and 20 controls. Results. Both sNRP-1 and NRP-1 protein were correlated with stage. sNRP-1 showed high diagnostic ability for cervical cancer and CIN, with a sensitivity of 70.97% and a specificity of 73.68%. Conclusions. Circulating sNRP-1 can serve as a potentially valuable diagnostic biomarker for cervical cancer and CIN.