Search CORE

129 research outputs found

A new shaft multi-objective optimization dynamic balancing method based on differential search algorithm

Author: Guangrui Wen
Henghui Zhang
Xiaoni Dong
Publication venue: 'JVE International Ltd.'
Publication date: 03/11/2014

Combining the influence coefficient method with the Holo-balancing theory of shaft systems, a new multi-objective optimization dynamic balancing method for shafts is proposed. It accounts for the energy, uniformity, and maximum of the residual vibration by building a multi-objective fuzzy evaluation function and applying the Differential Search (DS) algorithm. The advantages of the DS algorithm are studied by comparison with four other optimization algorithms, and the principle by which the DS algorithm optimizes balancing weights is studied to realize shaft dynamic balancing. Finally, the validity and effectiveness of the proposed method are verified through a field balancing case on a power generator set.
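The three residual-vibration criteria named in this abstract can be illustrated with a minimal sketch. It assumes the usual influence coefficient relation, residual = initial vibration + coefficients × correction weights, with complex numbers carrying amplitude and phase. The function names and the use of a population standard deviation as the uniformity measure are assumptions made for illustration; the paper's fuzzy evaluation function and the DS search itself are not reproduced here.

```python
import statistics

def residual_vibration(v0, coeffs, weights):
    """Residual vibration at each measuring point: r = v0 + coeffs @ weights.

    All quantities are complex numbers (amplitude and phase).
    """
    return [
        v + sum(a * w for a, w in zip(row, weights))
        for v, row in zip(v0, coeffs)
    ]

def objectives(residual):
    """Return (energy, uniformity, peak) of the residual amplitudes.

    Uniformity is measured here as the population standard deviation
    of the amplitudes, which is an illustrative choice only.
    """
    amps = [abs(r) for r in residual]
    energy = sum(a * a for a in amps)
    uniformity = statistics.pstdev(amps)
    peak = max(amps)
    return energy, uniformity, peak

# Hypothetical two-plane example with an identity coefficient matrix.
v0 = [2 + 0j, 0 + 1j]
coeffs = [[1 + 0j, 0j], [0j, 1 + 0j]]
no_correction = objectives(residual_vibration(v0, coeffs, [0j, 0j]))
full_correction = objectives(residual_vibration(v0, coeffs, [-2 + 0j, -1j]))
```

An optimizer such as DS would search over the correction weights to trade these three values off against each other; any scalar combination of them is a modelling decision outside this sketch.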

Multi-Modal Mutual Attention and Iterative Interaction for Referring Image Segmentation

Author: Ding Henghui
Jiang Xudong
Liu Chang
Zhang Yulun
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/05/2023

We address the problem of referring image segmentation that aims to generate a mask for the object specified by a natural language expression. Many recent works utilize Transformer to extract features for the target object by aggregating the attended visual regions. However, the generic attention mechanism in Transformer only uses the language input for attention weight calculation, which does not explicitly fuse language features in its output. Thus, its output feature is dominated by vision information, which limits the model to comprehensively understand the multi-modal information, and brings uncertainty for the subsequent mask decoder to extract the output mask. To address this issue, we propose Multi-Modal Mutual Attention ($\mathrm{M^3Att}$) and Multi-Modal Mutual Decoder ($\mathrm{M^3Dec}$) that better fuse information from the two input modalities. Based on $\mathrm{M^3Dec}$, we further propose Iterative Multi-modal Interaction ($\mathrm{IMI}$) to allow continuous and in-depth interactions between language and vision features. Furthermore, we introduce Language Feature Reconstruction ($\mathrm{LFR}$) to prevent the language information from being lost or distorted in the extracted feature. Extensive experiments show that our proposed approach significantly improves the baseline and outperforms state-of-the-art referring image segmentation methods on RefCOCO series datasets consistently. Comment: IEEE TI
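The limitation of generic attention described in this abstract can be seen in a minimal sketch. It assumes a single language query attending over vision features that serve as both keys and values, with no learned projections. The function names are hypothetical, and this is plain scaled dot-product attention, not the proposed mutual attention.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def generic_attention(language_query, vision_features):
    """Scaled dot-product attention with vision features as keys and values.

    The language query only shapes the weights; the output is a
    weighted average of the vision features.
    """
    scale = math.sqrt(len(language_query))
    scores = [
        sum(q * k for q, k in zip(language_query, feat)) / scale
        for feat in vision_features
    ]
    weights = softmax(scores)
    dim = len(vision_features[0])
    output = [
        sum(w * feat[d] for w, feat in zip(weights, vision_features))
        for d in range(dim)
    ]
    return weights, output

# Hypothetical toy inputs: one 2-d language query, three vision features.
weights, output = generic_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
```

Because the weights are non-negative and sum to one, every output coordinate stays within the range spanned by the vision features, which is one way to read the statement that the output is dominated by vision information.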

arXiv.org e-Print Archive

Downregulation of desmuslin in primary vein incompetence

Author: Liu Qing
Wang Jinsong
Wang Shenming
Yin Henghui
Yin Wei
Zhang Ge
Zhang Xinling
Publication venue: The Society for Vascular Surgery. Published by Mosby, Inc.
Publication date: 28/02/2006

Objective: Primary vein incompetence is one of the most common diseases of the peripheral veins, but its pathogenesis is unknown. These veins present obvious congenital defects, and examination of gene expression profiles of the incompetent vein specimens may provide important clues. The aim of this study was to screen for genes affecting the primary vein incompetence phenotype and test the differential expression of certain genes.
Methods: We compared gene expression profiles of valvular areas from incompetent and normal great saphenous veins at the saphenofemoral junctions by fluorescent differential display reverse-transcription polymerase chain reaction (FDD RT-PCR). Differentially expressed complementary DNAs (cDNAs) were confirmed by Northern blotting and semi-quantitative RT-PCR. Similarity of the cDNA sequences to GenBank sequences was determined. Gene expression status was then determined by Western blot analysis and immunohistochemical techniques.
Results: There were >30 differentially expressed cDNA bands. Sequence analysis revealed that a cDNA fragment obviously downregulated in incompetent great saphenous vein was a portion of the messenger RNA (mRNA) encoding desmuslin, a newly discovered intermediate filament protein. Northern blotting and semi-quantitative RT-PCR analysis revealed a similar mRNA expression profile of the desmuslin gene in other samples. Western blotting and immunohistochemical techniques localized the desmuslin protein mainly in the cytoplasm of venous smooth muscle cells. The amount of desmuslin was greatly decreased in the smooth muscle cells of incompetent veins.
Conclusions: The expression of many genes is altered in primary vein incompetence. Up- or downregulation of these genes may be involved in the pathogenesis of this disease. Desmuslin expression is downregulated in the abnormal veins. Its effect on the integrity of smooth muscle cells might be related to malformation of the vein wall. Further studies are needed to investigate other differentially expressed cDNAs and the exact role of desmuslin in this disease.
Clinical Relevance: Primary vein incompetence is a frequent and refractory disease of the peripheral veins. Exploring its pathogenesis may enhance our comprehension and management of this disease. We used reliable techniques to detect disease-related genes and confirmed downregulation of desmuslin in abnormal veins. Alteration of these genes might be used as disease markers or gene therapy targets.

Elsevier - Publisher Connector

Federated Incremental Semantic Segmentation

Author: Cong Wei
Cong Yang
Dai Dengxin
Ding Henghui
Dong Jiahua
Zhang Duzhen
Publication venue
Publication date: 10/04/2023

Federated learning-based semantic segmentation (FSS) has drawn widespread attention via decentralized training on local clients. However, most FSS models assume categories are fixed in advance, and thus suffer heavy forgetting of old categories in practical applications where local clients receive new categories incrementally while having no memory storage to access old classes. Moreover, new clients collecting novel classes may join in the global training of FSS, which further exacerbates catastrophic forgetting. To surmount the above challenges, we propose a Forgetting-Balanced Learning (FBL) model to address heterogeneous forgetting on old classes from both intra-client and inter-client aspects. Specifically, under the guidance of pseudo labels generated via adaptive class-balanced pseudo labeling, we develop a forgetting-balanced semantic compensation loss and a forgetting-balanced relation consistency loss to rectify intra-client heterogeneous forgetting of old categories with background shift. It performs balanced gradient propagation and relation consistency distillation within local clients. Moreover, to tackle heterogeneous forgetting from the inter-client aspect, we propose a task transition monitor. It can identify new classes under privacy protection and store the latest old global model for relation distillation. Qualitative experiments reveal a large improvement of our model over comparison methods. The code is available at https://github.com/JiahuaDong/FISS. Comment: Accepted to CVPR202

arXiv.org e-Print Archive

Transformer-Based Visual Segmentation: A Survey

Author: Chen Kai
Cheng Guangliang
Ding Henghui
Li Xiangtai
Liu Ziwei
Loy Chen Change
Pang Jiangmiao
Yuan Haobo
Zhang Wenwei
Publication venue
Publication date: 19/04/2023

Visual segmentation seeks to partition images, video frames, or point clouds into multiple segments or groups. This technique has numerous real-world applications, such as autonomous driving, image editing, robot sensing, and medical analysis. Over the past decade, deep learning-based methods have made remarkable strides in this area. Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks. Specifically, vision transformers offer robust, unified, and even simpler solutions for various segmentation tasks. This survey provides a thorough overview of transformer-based visual segmentation, summarizing recent advancements. We first review the background, encompassing problem definitions, datasets, and prior convolutional methods. Next, we summarize a meta-architecture that unifies all recent transformer-based approaches. Based on this meta-architecture, we examine various method designs, including modifications to the meta-architecture and associated applications. We also present several closely related settings, including 3D point cloud segmentation, foundation model tuning, domain-aware segmentation, efficient segmentation, and medical segmentation. Additionally, we compile and re-evaluate the reviewed methods on several well-established datasets. Finally, we identify open challenges in this field and propose directions for future research. The project page can be found at https://github.com/lxtGH/Awesome-Segmenation-With-Transformer. We will also continually monitor developments in this rapidly evolving field. Comment: Work in progress. Github: https://github.com/lxtGH/Awesome-Segmenation-With-Transforme

arXiv.org e-Print Archive

An Interfacial Layer based on Polymer of Intrinsic Microporosity to Suppress Dendrite Growth on Li‐Metal Anodes

Author: Li Wen
Mckeown Neil B.
Pei Hao
Qi Liya
Qu Liangliang
Shang Luoran
Wu Kai
Wu Zhengwei
Yang Zhengjin
Zhang Lexiang
Zhang Weixia
Zhou Henghui
Publication venue: 'Wiley'
Publication date: 03/07/2019

Crossref

Edinburgh Research Explorer

Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge

Author: Castelli Vittorio
Fu Xingyu
Kwon Gukyeong
Li Alexander Hanbo
Ng Patrick
Perera Pramuditha
Roth Dan
Wang William Yang
Wang Zhiguo
Xiang Bing
Zhang Sheng
Zhang Yuhao
Zhu Henghui
Publication venue
Publication date: 30/05/2023

The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge. Recently, pre-trained Language Models (PLM) such as GPT-3 have been applied to the task and shown to be powerful world knowledge sources. However, these methods suffer from low knowledge coverage caused by PLM bias -- the tendency to generate certain tokens over other tokens regardless of prompt changes, and high dependency on the PLM quality -- only models using GPT-3 can achieve the best result. To address the aforementioned challenges, we propose RASO: a new VQA pipeline that deploys a generate-then-select strategy guided by world knowledge for the first time. Rather than following the de facto standard to train a multi-modal model that directly generates the VQA answer, RASO first adopts PLM to generate all the possible answers, and then trains a lightweight answer selection model for the correct answer. As proved in our analysis, RASO expands the knowledge coverage from in-domain training data by a large margin. We provide extensive experimentation and show the effectiveness of our pipeline by advancing the state-of-the-art by 4.1% on OK-VQA, without additional computation cost. Code and models are released at http://cogcomp.org/page/publication_view/1010 Comment: Accepted to ACL 2023 Finding
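The generate-then-select strategy in this abstract has a simple two-stage control flow, sketched below. The generator and the scorer are hypothetical toy stand-ins passed in as callables; they are assumptions for illustration and do not represent the PLM or the answer selection model used in RASO.

```python
from typing import Callable, List, Optional

def generate_then_select(
    question: str,
    generate: Callable[[str], List[str]],
    score: Callable[[str, str], float],
) -> Optional[str]:
    """Stage 1: collect candidate answers. Stage 2: return the best-scored one."""
    candidates = generate(question)
    if not candidates:
        return None
    return max(candidates, key=lambda answer: score(question, answer))

# Toy stand-ins (hypothetical): a fixed candidate list and a lookup-table scorer.
def toy_generate(question: str) -> List[str]:
    return ["umbrella", "raincoat", "sunglasses"]

TOY_SCORES = {"umbrella": 0.7, "raincoat": 0.2, "sunglasses": 0.1}

def toy_score(question: str, answer: str) -> float:
    return TOY_SCORES.get(answer, 0.0)

best = generate_then_select("What is the person holding in the rain?", toy_generate, toy_score)
```

Separating the two stages means the selector only has to rank a short candidate list, which is the property the abstract relies on when it describes the selection model as lightweight.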

arXiv.org e-Print Archive

Towards Open Vocabulary Learning: A Survey

Author: Ding Henghui
Ghanem Bernard
Jiang Xudong
Li Xia
Li Xiangtai
Tao Dacheng
Tong Yunhai
Wu Jianzong
Yang Yibo
Xu Shilin
Yuan Haobo
Zhang Jiangning
Publication venue
Publication date: 27/06/2023

In the field of visual scene understanding, deep neural networks have made impressive advancements in various core tasks like segmentation, tracking, and detection. However, most approaches operate on the close-set assumption, meaning that the model can only identify pre-defined categories that are present in the training set. Recently, open vocabulary settings were proposed due to the rapid progress of vision language pre-training. These new approaches seek to locate and recognize categories beyond the annotated label space. The open vocabulary approach is more general, practical, and effective compared to weakly supervised and zero-shot settings. This paper provides a thorough review of open vocabulary learning, summarizing and analyzing recent developments in the field. In particular, we begin by comparing it to related concepts such as zero-shot learning, open-set recognition, and out-of-distribution detection. Then, we review several closely related tasks in the case of segmentation and detection, including long-tail problems, few-shot, and zero-shot settings. For the method survey, we first present the basic knowledge of detection and segmentation in close-set as the preliminary knowledge. Next, we examine various scenarios in which open vocabulary learning is used, identifying common design elements and core ideas. Then, we compare the recent detection and segmentation approaches in commonly used datasets and benchmarks. Finally, we conclude with insights, issues, and discussions regarding future research directions. To our knowledge, this is the first comprehensive literature review of open vocabulary learning. We keep tracing related works at https://github.com/jianzongwu/Awesome-Open-Vocabulary. Comment: Project page at https://github.com/jianzongwu/Awesome-Open-Vocabular

arXiv.org e-Print Archive