129 research outputs found

    A new shaft multi-objective optimization dynamic balancing method based on differential search algorithm

    Combining the influence coefficient method with the holo-balancing theory of shaft systems, this paper proposes a new multi-objective optimization method for shaft dynamic balancing. The method accounts for the energy, uniformity, and maximum of the residual vibration by building a multi-objective fuzzy evaluation function and applying the Differential Search (DS) algorithm. The advantages of the DS algorithm are studied by comparing it with four other optimization algorithms, and the principle by which the DS algorithm optimizes the balancing weights is examined in order to realize shaft dynamic balancing. Finally, the validity and effectiveness of the proposed method are verified through a field balancing case on a power generator set.

    Multi-Modal Mutual Attention and Iterative Interaction for Referring Image Segmentation

    We address the problem of referring image segmentation, which aims to generate a mask for the object specified by a natural language expression. Many recent works use a Transformer to extract features for the target object by aggregating the attended visual regions. However, the generic attention mechanism in the Transformer uses the language input only for attention weight calculation and does not explicitly fuse language features into its output. Its output feature is therefore dominated by vision information, which limits the model's ability to comprehensively understand the multi-modal information and introduces uncertainty for the subsequent mask decoder when it extracts the output mask. To address this issue, we propose Multi-Modal Mutual Attention ($\mathrm{M^3Att}$) and Multi-Modal Mutual Decoder ($\mathrm{M^3Dec}$), which better fuse information from the two input modalities. Based on $\mathrm{M^3Dec}$, we further propose Iterative Multi-modal Interaction ($\mathrm{IMI}$) to allow continuous and in-depth interactions between language and vision features. Furthermore, we introduce Language Feature Reconstruction ($\mathrm{LFR}$) to prevent the language information from being lost or distorted in the extracted feature. Extensive experiments show that our proposed approach significantly improves on the baseline and consistently outperforms state-of-the-art referring image segmentation methods on the RefCOCO series of datasets. Comment: IEEE TI

    Downregulation of desmuslin in primary vein incompetence

    Objective: Primary vein incompetence is one of the most common diseases of the peripheral veins, but its pathogenesis is unknown. These veins present obvious congenital defects, and examination of gene expression profiles of the incompetent vein specimens may provide important clues. The aim of this study was to screen for genes affecting the primary vein incompetence phenotype and test the differential expression of certain genes. Methods: We compared gene expression profiles of valvular areas from incompetent and normal great saphenous veins at the saphenofemoral junctions by fluorescent differential display reverse-transcription polymerase chain reaction (FDD RT-PCR). Differentially expressed complementary DNAs (cDNAs) were confirmed by Northern blotting and semi-quantitative RT-PCR. Similarity of the cDNA sequences to GenBank sequences was determined. Gene expression status was then determined by Western blot analysis and immunohistochemical techniques. Results: There were >30 differentially expressed cDNA bands. Sequence analysis revealed that a cDNA fragment obviously downregulated in incompetent great saphenous vein was a portion of the messenger RNA (mRNA) encoding desmuslin, a newly discovered intermediate filament protein. Northern blotting and semi-quantitative RT-PCR analysis revealed a similar mRNA expression profile of the desmuslin gene in other samples. Western blotting and immunohistochemical techniques localized the desmuslin protein mainly in the cytoplasm of venous smooth muscle cells. The amount of desmuslin was greatly decreased in the smooth muscle cells of incompetent veins. Conclusions: The expression of many genes is altered in primary vein incompetence. Up- or downregulation of these genes may be involved in the pathogenesis of this disease. Desmuslin expression is downregulated in the abnormal veins. Its effect on the integrity of smooth muscle cells might be related to malformation of the vein wall. Further studies are needed to investigate other differentially expressed cDNAs and the exact role of desmuslin in this disease. Clinical Relevance: Primary vein incompetence is a frequent and refractory disease of the peripheral veins. Exploring its pathogenesis may enhance our comprehension and management of this disease. We used reliable techniques to detect disease-related genes and confirmed downregulation of desmuslin in abnormal veins. Alteration of these genes might be used as disease markers or gene therapy targets.

    Federated Incremental Semantic Segmentation

    Federated learning-based semantic segmentation (FSS) has drawn widespread attention via decentralized training on local clients. However, most FSS models assume categories are fixed in advance, and therefore suffer severe forgetting of old categories in practical applications where local clients receive new categories incrementally while having no memory storage to access old classes. Moreover, new clients collecting novel classes may join the global training of FSS, which further exacerbates catastrophic forgetting. To surmount these challenges, we propose a Forgetting-Balanced Learning (FBL) model to address heterogeneous forgetting of old classes from both intra-client and inter-client aspects. Specifically, under the guidance of pseudo labels generated via adaptive class-balanced pseudo labeling, we develop a forgetting-balanced semantic compensation loss and a forgetting-balanced relation consistency loss to rectify intra-client heterogeneous forgetting of old categories with background shift. These losses perform balanced gradient propagation and relation consistency distillation within local clients. Moreover, to tackle heterogeneous forgetting from the inter-client aspect, we propose a task transition monitor. It can identify new classes under privacy protection and store the latest old global model for relation distillation. Qualitative experiments reveal a large improvement of our model over the comparison methods. The code is available at https://github.com/JiahuaDong/FISS. Comment: Accepted to CVPR202

    Transformer-Based Visual Segmentation: A Survey

    Visual segmentation seeks to partition images, video frames, or point clouds into multiple segments or groups. This technique has numerous real-world applications, such as autonomous driving, image editing, robot sensing, and medical analysis. Over the past decade, deep learning-based methods have made remarkable strides in this area. Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks. Specifically, vision transformers offer robust, unified, and even simpler solutions for various segmentation tasks. This survey provides a thorough overview of transformer-based visual segmentation, summarizing recent advancements. We first review the background, encompassing problem definitions, datasets, and prior convolutional methods. Next, we summarize a meta-architecture that unifies all recent transformer-based approaches. Based on this meta-architecture, we examine various method designs, including modifications to the meta-architecture and associated applications. We also present several closely related settings, including 3D point cloud segmentation, foundation model tuning, domain-aware segmentation, efficient segmentation, and medical segmentation. Additionally, we compile and re-evaluate the reviewed methods on several well-established datasets. Finally, we identify open challenges in this field and propose directions for future research. The project page can be found at https://github.com/lxtGH/Awesome-Segmenation-With-Transformer. We will also continually monitor developments in this rapidly evolving field. Comment: Work in progress. Github: https://github.com/lxtGH/Awesome-Segmenation-With-Transforme

    Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge

    The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge. Recently, pre-trained Language Models (PLMs) such as GPT-3 have been applied to the task and shown to be powerful sources of world knowledge. However, these methods suffer from two problems: low knowledge coverage caused by PLM bias (the tendency to generate certain tokens over other tokens regardless of prompt changes), and high dependency on PLM quality (only models using GPT-3 can achieve the best result). To address these challenges, we propose RASO: a new VQA pipeline that, for the first time, deploys a generate-then-select strategy guided by world knowledge. Rather than following the de facto standard of training a multi-modal model that directly generates the VQA answer, RASO first uses a PLM to generate all the possible answers, and then trains a lightweight answer selection model to pick the correct answer. As shown in our analysis, RASO expands the knowledge coverage from in-domain training data by a large margin. We provide extensive experimentation and show the effectiveness of our pipeline by advancing the state of the art by 4.1% on OK-VQA, without additional computation cost. Code and models are released at http://cogcomp.org/page/publication_view/1010 Comment: Accepted to ACL 2023 Finding

    Towards Open Vocabulary Learning: A Survey

    In the field of visual scene understanding, deep neural networks have made impressive advancements in various core tasks like segmentation, tracking, and detection. However, most approaches operate on the closed-set assumption, meaning that the model can only identify pre-defined categories that are present in the training set. Recently, open vocabulary settings were proposed due to the rapid progress of vision language pre-training. These new approaches seek to locate and recognize categories beyond the annotated label space. The open vocabulary approach is more general, practical, and effective compared to weakly supervised and zero-shot settings. This paper provides a thorough review of open vocabulary learning, summarizing and analyzing recent developments in the field. In particular, we begin by comparing it to related concepts such as zero-shot learning, open-set recognition, and out-of-distribution detection. Then, we review several closely related tasks in the case of segmentation and detection, including long-tail problems, few-shot, and zero-shot settings. For the method survey, we first present the basics of closed-set detection and segmentation as preliminary knowledge. Next, we examine various scenarios in which open vocabulary learning is used, identifying common design elements and core ideas. Then, we compare the recent detection and segmentation approaches on commonly used datasets and benchmarks. Finally, we conclude with insights, issues, and discussions regarding future research directions. To our knowledge, this is the first comprehensive literature review of open vocabulary learning. We continue to track related work at https://github.com/jianzongwu/Awesome-Open-Vocabulary. Comment: Project page at https://github.com/jianzongwu/Awesome-Open-Vocabular