129 research outputs found
A new shaft multi-objective optimization dynamic balancing method based on differential search algorithm
Combining the influence coefficient method with the Holo-balancing theory of shaft systems, a new multi-objective optimization method for shaft dynamic balancing is proposed. The method accounts for the energy, uniformity, and maximum of the residual vibration by building a multi-objective fuzzy evaluation function and applying the Differential Search (DS) algorithm. The advantages of the DS algorithm are examined by comparing it with four other optimization algorithms, and the principle by which DS optimizes the balancing weights is studied to realize shaft dynamic balancing. Finally, the validity and effectiveness of the proposed method are verified through a field balancing case on a power generator set.
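As a rough illustration of the influence coefficient idea in the abstract above, the sketch below computes a single-plane balancing weight by complex least squares and then evaluates the three residual-vibration quantities the abstract names. Everything here is an assumption made for illustration: the vibration readings and coefficients are invented, "uniformity" is taken to be the variance of the residual magnitudes, and plain least squares stands in for the paper's fuzzy evaluation function and DS search, which are not reproduced.

```python
# Illustrative sketch only: single-plane influence-coefficient balancing.
# All numbers are made up; this is NOT the paper's fuzzy objective or DS search.

def residuals(v0, alpha, w):
    """Residual vibration at each measurement point after adding weight w."""
    return [v + a * w for v, a in zip(v0, alpha)]

def least_squares_weight(v0, alpha):
    """Complex weight that minimises the sum of squared residual magnitudes."""
    num = sum(a.conjugate() * v for v, a in zip(v0, alpha))
    den = sum(abs(a) ** 2 for a in alpha)
    return -num / den

def metrics(res):
    """Energy, uniformity (assumed: variance of magnitudes), and maximum."""
    mags = [abs(r) for r in res]
    mean = sum(mags) / len(mags)
    return {
        "energy": sum(m * m for m in mags),
        "uniformity": sum((m - mean) ** 2 for m in mags) / len(mags),
        "max": max(mags),
    }

# Hypothetical initial vibration vectors and influence coefficients.
v0 = [complex(40, 10), complex(-15, 30), complex(20, -25)]
alpha = [complex(0.8, 0.2), complex(-0.3, 0.6), complex(0.5, -0.4)]

w = least_squares_weight(v0, alpha)
before = metrics(v0)
after = metrics(residuals(v0, alpha, w))
print(before, after)
```

Least squares minimises only the energy term; a multi-objective method such as the one described would trade it off against the other two quantities.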
Multi-Modal Mutual Attention and Iterative Interaction for Referring Image Segmentation
We address the problem of referring image segmentation that aims to generate
a mask for the object specified by a natural language expression. Many recent
works utilize Transformer to extract features for the target object by
aggregating the attended visual regions. However, the generic attention
mechanism in Transformer only uses the language input for attention weight
calculation, which does not explicitly fuse language features in its output.
Thus, its output feature is dominated by vision information, which limits the
model to comprehensively understand the multi-modal information, and brings
uncertainty for the subsequent mask decoder to extract the output mask. To
address this issue, we propose Multi-Modal Mutual Attention
and a Multi-Modal Mutual Decoder that better fuse information
from the two input modalities. Based on this decoder, we further propose
Iterative Multi-modal Interaction to allow continuous and
in-depth interactions between language and vision features. Furthermore, we
introduce Language Feature Reconstruction to prevent the
language information from being lost or distorted in the extracted feature.
Extensive experiments show that our proposed approach significantly improves
the baseline and outperforms state-of-the-art referring image segmentation
methods on RefCOCO series datasets consistently.
Comment: IEEE TI
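The limitation of generic attention described in the abstract above can be seen in a small sketch. It assumes language features act as queries while vision features supply keys and values; the function names and toy vectors are hypothetical, and the paper's proposed mutual attention is not implemented here. Because each output row is a softmax-weighted average of the vision values, language influences only the weights.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(lang_queries, vis_keys, vis_values):
    """Generic scaled dot-product attention.

    Assumption: language provides the queries; vision provides keys and values.
    """
    scale = math.sqrt(len(vis_keys[0]))
    dim = len(vis_values[0])
    outputs, weights = [], []
    for q in lang_queries:
        w = softmax([dot(q, k) / scale for k in vis_keys])
        # The output is a convex combination of vision values only.
        outputs.append([sum(wi * v[d] for wi, v in zip(w, vis_values))
                        for d in range(dim)])
        weights.append(w)
    return outputs, weights

# Toy features (made up): 2 word tokens, 3 visual regions, dimension 2.
lang = [[1.0, 0.0], [0.5, 0.5]]
vis_k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vis_v = [[2.0, 0.0], [0.0, 4.0], [1.0, 1.0]]

outputs, weights = cross_attention(lang, vis_k, vis_v)
print(outputs)
```

Every output coordinate stays within the range spanned by the vision values, which is one way to read the abstract's claim that the output feature is dominated by vision information.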
Downregulation of desmuslin in primary vein incompetence
Objective: Primary vein incompetence is one of the most common diseases of the peripheral veins, but its pathogenesis is unknown. These veins present obvious congenital defects, and examination of gene expression profiles of the incompetent vein specimens may provide important clues. The aim of this study was to screen for genes affecting the primary vein incompetence phenotype and test the differential expression of certain genes. Methods: We compared gene expression profiles of valvular areas from incompetent and normal great saphenous veins at the saphenofemoral junctions by fluorescent differential display reverse-transcription polymerase chain reaction (FDD RT-PCR). Differentially expressed complementary DNAs (cDNAs) were confirmed by Northern blotting and semi-quantitative RT-PCR. Similarity of the cDNA sequences to GenBank sequences was determined. Gene expression status was then determined by Western blot analysis and immunohistochemical techniques. Results: There were >30 differentially expressed cDNA bands. Sequence analysis revealed that a cDNA fragment obviously downregulated in incompetent great saphenous vein was a portion of the messenger RNA (mRNA) encoding desmuslin, a newly discovered intermediate filament protein. Northern blotting and semi-quantitative RT-PCR analysis revealed a similar mRNA expression profile of the desmuslin gene in other samples. Western blotting and immunohistochemical techniques localized the desmuslin protein mainly in the cytoplasm of venous smooth muscle cells. The amount of desmuslin was greatly decreased in the smooth muscle cells of incompetent veins. Conclusions: The expression of many genes is altered in primary vein incompetence. Up- or downregulation of these genes may be involved in the pathogenesis of this disease. Desmuslin expression is downregulated in the abnormal veins. Its effect on the integrity of smooth muscle cells might be related to malformation of the vein wall.
Further studies are needed to investigate other differentially expressed cDNAs and the exact role of desmuslin in this disease. Clinical Relevance: Primary vein incompetence is a frequent and refractory disease of the peripheral veins. Exploring its pathogenesis may enhance our comprehension and management of this disease. We used reliable techniques to detect disease-related genes and confirmed downregulation of desmuslin in abnormal veins. Alteration of these genes might be used as disease markers or gene therapy targets.
Federated Incremental Semantic Segmentation
Federated learning-based semantic segmentation (FSS) has drawn widespread
attention via decentralized training on local clients. However, most FSS models
assume categories are fixed in advance, and thus suffer heavy forgetting of
old categories in practical applications where local clients receive new
categories incrementally while having no memory storage to access old classes.
Moreover, new clients collecting novel classes may join in the global training
of FSS, which further exacerbates catastrophic forgetting. To surmount the
above challenges, we propose a Forgetting-Balanced Learning (FBL) model to
address heterogeneous forgetting on old classes from both intra-client and
inter-client aspects. Specifically, under the guidance of pseudo labels
generated via adaptive class-balanced pseudo labeling, we develop a
forgetting-balanced semantic compensation loss and a forgetting-balanced
relation consistency loss to rectify intra-client heterogeneous forgetting of
old categories with background shift. It performs balanced gradient propagation
and relation consistency distillation within local clients. Moreover, to tackle
heterogeneous forgetting from inter-client aspect, we propose a task transition
monitor. It can identify new classes under privacy protection and store the
latest old global model for relation distillation. Qualitative experiments
reveal large improvement of our model against comparison methods. The code is
available at https://github.com/JiahuaDong/FISS.
Comment: Accepted to CVPR202
Transformer-Based Visual Segmentation: A Survey
Visual segmentation seeks to partition images, video frames, or point clouds
into multiple segments or groups. This technique has numerous real-world
applications, such as autonomous driving, image editing, robot sensing, and
medical analysis. Over the past decade, deep learning-based methods have made
remarkable strides in this area. Recently, transformers, a type of neural
network based on self-attention originally designed for natural language
processing, have considerably surpassed previous convolutional or recurrent
approaches in various vision processing tasks. Specifically, vision
transformers offer robust, unified, and even simpler solutions for various
segmentation tasks. This survey provides a thorough overview of
transformer-based visual segmentation, summarizing recent advancements. We
first review the background, encompassing problem definitions, datasets, and
prior convolutional methods. Next, we summarize a meta-architecture that
unifies all recent transformer-based approaches. Based on this
meta-architecture, we examine various method designs, including modifications
to the meta-architecture and associated applications. We also present several
closely related settings, including 3D point cloud segmentation, foundation
model tuning, domain-aware segmentation, efficient segmentation, and medical
segmentation. Additionally, we compile and re-evaluate the reviewed methods on
several well-established datasets. Finally, we identify open challenges in this
field and propose directions for future research. The project page can be found
at https://github.com/lxtGH/Awesome-Segmenation-With-Transformer. We will also
continually monitor developments in this rapidly evolving field.
Comment: Work in progress. Github:
https://github.com/lxtGH/Awesome-Segmenation-With-Transforme
An Interfacial Layer based on Polymer of Intrinsic Microporosity to Suppress Dendrite Growth on Li‐Metal Anodes
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge
The open-ended Visual Question Answering (VQA) task requires AI models to
jointly reason over visual and natural language inputs using world knowledge.
Recently, pre-trained Language Models (PLM) such as GPT-3 have been applied to
the task and shown to be powerful world knowledge sources. However, these
methods suffer from low knowledge coverage caused by PLM bias -- the tendency
to generate certain tokens over other tokens regardless of prompt changes, and
high dependency on the PLM quality -- only models using GPT-3 can achieve the
best result.
To address the aforementioned challenges, we propose RASO: a new VQA pipeline
that deploys a generate-then-select strategy guided by world knowledge for the
first time. Rather than following the de facto standard to train a multi-modal
model that directly generates the VQA answer, RASO first adopts PLM to generate
all the possible answers, and then trains a lightweight answer selection model
for the correct answer. As proved in our analysis, RASO expands the knowledge
coverage from in-domain training data by a large margin. We provide extensive
experimentation and show the effectiveness of our pipeline by advancing the
state-of-the-art by 4.1% on OK-VQA, without additional computation cost. Code
and models are released at http://cogcomp.org/page/publication_view/1010
Comment: Accepted to ACL 2023 Finding
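The generate-then-select strategy in the abstract above can be sketched as two stages. This is a toy under loud assumptions, not RASO itself: `generate_candidates` is a hypothetical stub returning canned answers where a real system would prompt a pre-trained language model, and `overlap_score` is a trivial stand-in for the trained lightweight answer selection model.

```python
def generate_candidates(question, context):
    """Stage 1 (hypothetical stub): stands in for prompting a PLM."""
    canned = {
        "what sport is being played?": ["tennis", "baseball", "badminton"],
    }
    return canned.get(question.lower(), [])

def overlap_score(candidate, context):
    """Toy selector score: 1.0 if the candidate word occurs in the context."""
    return 1.0 if candidate in context.lower().split() else 0.0

def select_answer(candidates, context, score_fn=overlap_score):
    """Stage 2: pick the highest-scoring candidate, or None if there are none."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: score_fn(c, context))

question = "What sport is being played?"
context = "a man swings a racket on a tennis court"  # made-up image description

candidates = generate_candidates(question, context)
answer = select_answer(candidates, context)
print(candidates, answer)
```

Separating the stages means the selector only has to rank a short candidate list rather than generate an answer from scratch, which is the design choice the abstract credits for broader knowledge coverage.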
Towards Open Vocabulary Learning: A Survey
In the field of visual scene understanding, deep neural networks have made
impressive advancements in various core tasks like segmentation, tracking, and
detection. However, most approaches operate on the closed-set assumption,
meaning that the model can only identify pre-defined categories that are
present in the training set. Recently, open vocabulary settings were proposed
due to the rapid progress of vision language pre-training. These new approaches
seek to locate and recognize categories beyond the annotated label space. The
open vocabulary approach is more general, practical, and effective compared to
weakly supervised and zero-shot settings. This paper provides a thorough review
of open vocabulary learning, summarizing and analyzing recent developments in
the field. In particular, we begin by comparing it to related concepts such as
zero-shot learning, open-set recognition, and out-of-distribution detection.
Then, we review several closely related tasks in the case of segmentation and
detection, including long-tail problems, few-shot, and zero-shot settings. For
the method survey, we first present the basic knowledge of detection and
segmentation in the closed-set setting as preliminary knowledge. Next, we examine
various scenarios in which open vocabulary learning is used, identifying common
design elements and core ideas. Then, we compare the recent detection and
segmentation approaches in commonly used datasets and benchmarks. Finally, we
conclude with insights, issues, and discussions regarding future research
directions. To our knowledge, this is the first comprehensive literature review
of open vocabulary learning. We keep tracking related work at
https://github.com/jianzongwu/Awesome-Open-Vocabulary.
Comment: Project page at https://github.com/jianzongwu/Awesome-Open-Vocabular