385 research outputs found
VIGraph: Self-supervised Learning for Class-Imbalanced Node Classification
Class imbalance in graph data poses significant challenges for node
classification. Existing methods, represented by SMOTE-based approaches,
partially alleviate this issue but still exhibit limitations during imbalanced
scenario construction. Self-supervised learning (SSL) offers a promising
solution by synthesizing minority nodes from the data itself, yet its potential
remains unexplored. In this paper, we analyze the limitations of SMOTE-based
approaches and introduce VIGraph, a novel SSL model based on the
self-supervised Variational Graph Auto-Encoder (VGAE) that leverages
Variational Inference (VI) to generate minority nodes. Specifically, VIGraph
strictly adheres to the concept of imbalance when constructing imbalanced
graphs and utilizes the generative VGAE to generate minority nodes. Moreover,
VIGraph introduces a novel Siamese contrastive strategy at the decoding phase
to improve the overall quality of generated nodes. VIGraph can generate
high-quality nodes without reintegrating them into the original graph,
eliminating the "Generating, Reintegrating, and Retraining" process found in
SMOTE-based methods. Experiments on multiple real-world datasets demonstrate
that VIGraph achieves promising results for class-imbalanced node
classification tasks
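The generative step mentioned in this abstract can be pictured with the standard reparameterization trick used by variational auto-encoders. The sketch below is only an illustration under stated assumptions: it presumes an encoder has already produced a mean and log-variance for a minority-class latent, and the names `sample_latent` and `generate_nodes` are hypothetical. It does not reproduce VIGraph's encoder, decoder, or Siamese contrastive strategy.

```python
import math
import random

def sample_latent(mu, logvar, rng=None):
    """Draw z = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    rng = rng or random.Random(0)
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def generate_nodes(mu, logvar, count, rng=None):
    """Sample `count` latent vectors from the same Gaussian posterior.

    In a full model, a decoder would map each latent vector to node
    features; that step is omitted here (assumption: not shown in the source).
    """
    rng = rng or random.Random(0)
    return [sample_latent(mu, logvar, rng) for _ in range(count)]
```

Sharing one random generator across calls keeps the samples distinct, while a very negative log-variance collapses each sample onto the mean.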
Can Large Language Models Empower Molecular Property Prediction?
Molecular property prediction has gained significant attention due to its
transformative potential in multiple scientific disciplines. Conventionally, a
molecule can be represented either as graph-structured data or as SMILES
text. Recently, the rapid development of Large Language Models (LLMs) has
revolutionized the field of NLP. Although it is natural to utilize LLMs to
assist in understanding molecules represented by SMILES, the exploration of how
LLMs will impact molecular property prediction is still in its early stage. In
this work, we advance towards this objective through two perspectives:
zero/few-shot molecular classification, and using the new explanations
generated by LLMs as representations of molecules. To be specific, we first
prompt LLMs to do in-context molecular classification and evaluate their
performance. After that, we employ LLMs to generate semantically enriched
explanations for the original SMILES and then leverage them to fine-tune a
small-scale LM for multiple downstream tasks. The experimental results
highlight the superiority of text explanations as molecular representations
across multiple benchmark datasets, and confirm the immense potential of LLMs
in molecular property prediction tasks. Code is available at
https://github.com/ChnQ/LLM4Mol
MovePose: A High-performance Human Pose Estimation Algorithm on Mobile and Edge Devices
We present MovePose, an optimized lightweight convolutional neural network
designed specifically for real-time body pose estimation on CPU-based mobile
devices. The current solutions do not provide satisfactory accuracy and speed
for human posture estimation, and MovePose addresses this gap. It aims to
maintain real-time performance while improving the accuracy of human posture
estimation for mobile devices. The network produces 17 keypoints for each
individual at a rate exceeding 11 frames per second, making it suitable for
real-time applications such as fitness tracking, sign language interpretation,
and advanced mobile human posture estimation. Our MovePose algorithm has
attained a Mean Average Precision (mAP) score of 67.7 on the COCO
validation dataset. The MovePose algorithm displayed efficiency
with a performance of 69+ frames per second (fps) when run on an Intel
i9-10920x CPU. Additionally, it showcased an increased performance of 452+ fps
on an NVIDIA RTX3090 GPU. On an Android phone equipped with a Snapdragon 8 + 4G
processor, the fps reached above 11. To enhance accuracy, we incorporated three
techniques: deconvolution, large kernel convolution, and coordinate
classification methods. Compared to basic upsampling, deconvolution is
trainable, improves model capacity, and enhances the receptive field. Large
kernel convolution strengthens these properties at a decreased computational
cost. In summary, MovePose provides high accuracy and real-time performance,
making it a potential tool for a variety of applications, including those
focused on mobile-side human posture estimation. The code and models for this
algorithm will be made publicly accessible
Rethinking Word-Level Auto-Completion in Computer-Aided Translation
Word-Level Auto-Completion (WLAC) plays a crucial role in Computer-Assisted
Translation. It aims at providing word-level auto-completion suggestions for
human translators. While previous studies have primarily focused on designing
complex model architectures, this paper takes a different perspective by
rethinking the fundamental question: what kind of words are good
auto-completions? We introduce a measurable criterion to answer this question
and discover that existing WLAC models often fail to meet this criterion.
Building upon this observation, we propose an effective approach to enhance
WLAC performance by promoting adherence to the criterion. Notably, the proposed
approach is general and can be applied to various encoder-based architectures.
Through extensive experiments, we demonstrate that our approach outperforms the
top-performing system submitted to the WLAC shared tasks in WMT2022, while
utilizing significantly smaller model sizes. Comment: EMNLP202
HGCVAE: Integrating Generative and Contrastive Learning for Heterogeneous Graph Learning
Generative self-supervised learning (SSL) has exhibited significant potential
and garnered increasing interest in graph learning. In this study, we aim to
explore the problem of generative SSL in the context of heterogeneous graph
learning (HGL). The previous SSL approaches for heterogeneous graphs have
primarily relied on contrastive learning, necessitating the design of complex
views to capture heterogeneity. However, existing generative SSL methods have
not fully leveraged the capabilities of generative models to address the
challenges of HGL. In this paper, we present HGCVAE, a novel contrastive
variational graph auto-encoder that liberates HGL from the burden of intricate
heterogeneity capturing. Instead of focusing on complicated heterogeneity,
HGCVAE harnesses the full potential of generative SSL. HGCVAE combines
contrastive learning with generative SSL through several key
innovations. First, we employ a progressive mechanism to generate
high-quality hard negative samples for contrastive learning, utilizing the
power of variational inference. Additionally, we present a dynamic mask
strategy to ensure effective and stable learning. Moreover, we propose an
enhanced scaled cosine error as the criterion for better attribute
reconstruction. As an initial step in combining generative and contrastive SSL,
HGCVAE achieves remarkable results compared to various state-of-the-art
baselines, confirming its superiority
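For orientation, the plain scaled cosine error used for attribute reconstruction in masked graph auto-encoders can be written as the mean of (1 - cos(x, z))^gamma over the reconstructed nodes. The sketch below shows only this baseline form. The "enhanced" variant proposed by HGCVAE is not specified in the abstract and is not reproduced here, and the function names and the default `gamma` are assumptions for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def scaled_cosine_error(originals, reconstructions, gamma=2.0):
    """Mean of (1 - cos(x, z)) ** gamma over paired attribute vectors.

    gamma >= 1 down-weights pairs that are already well reconstructed
    (assumption: gamma=2.0 is an arbitrary illustrative default).
    """
    errors = [(1.0 - cosine(x, z)) ** gamma
              for x, z in zip(originals, reconstructions)]
    return sum(errors) / len(errors)
```

A perfect reconstruction gives an error of 0, and an orthogonal one gives 1 regardless of `gamma`.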
Isolation of archaeal viruses with lipid membrane from Tengchong acidic hot springs
Archaeal viruses are one of the most mysterious parts of the virosphere because of their diverse morphologies and unique genome contents. The crenarchaeal viruses are commonly found in high-temperature, acidic hot springs, and the number of identified crenarchaeal viruses has increased rapidly over the past two decades. Over fifty viruses infecting members of the order Sulfolobales have been identified, most of which are from hot springs distributed in the United States, Russia, Iceland, Japan, and Italy. To further expand the reservoir of viruses infecting strains of Sulfolobaceae, we investigated virus diversity through cultivation-dependent approaches in hot springs in Tengchong, Yunnan, China. Eight different virus-like particles were detected in enrichment cultures, among which five new archaeal viruses were isolated and characterized. We showed that these viruses can infect acidophilic hyperthermophiles belonging to three different genera of the family Sulfolobaceae, namely, Saccharolobus, Sulfolobus, and Metallosphaera. We also compared the lipid compositions of the viral and cellular membranes and found that the lipid composition of some viral envelopes was very different from that of the host membrane. Collectively, our results showed that the Tengchong hot springs harbor highly diverse viruses, providing excellent models for archaeal virus-host studies
Do We Really Need Contrastive Learning for Graph Representation?
In recent years, contrastive learning has emerged as a dominant
self-supervised paradigm, attracting numerous research interests in the field
of graph learning. Graph contrastive learning (GCL) aims to embed augmented
anchor samples close to each other while pushing the embeddings of other
samples (negative samples) apart. However, existing GCL methods require large
and diverse negative samples to ensure the quality of embeddings, and recent
studies typically leverage samples excluding the anchor and positive samples as
negative samples, potentially introducing false negative samples (negatives
that share the same class as the anchor). Additionally, this practice can
result in a heavy computational burden and high time complexity,
which is particularly unaffordable for large graphs. To address these
deficiencies, we leverage rank learning and propose a simple yet effective
model, GraphRank. Specifically, we first generate two graph views through
corruption. Then, we compute the similarity of pairwise nodes (anchor node and
positive node) in both views; an arbitrary node in the latter view is then selected
as a negative node, and its similarity with the anchor node is computed. Based
on this, we introduce rank-based learning to measure similarity scores, which
successfully relieves the false negative problem and reduces the time
complexity. Moreover, we conducted extensive
experiments across multiple graph tasks, demonstrating that GraphRank performs
favorably against other cutting-edge GCL methods in various tasks
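The procedure described in this abstract (score the anchor against its positive, score it against one arbitrary negative, then rank the two) can be sketched as a margin-based ranking loss. This is a minimal illustration under assumptions, not GraphRank's implementation: the hinge form, the `margin` value, the cosine similarity, and the function names are all hypothetical choices, and the node embeddings of the two views are taken as given.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_loss(view1, view2, margin=0.5, rng=None):
    """Hinge ranking loss with one random negative per anchor.

    view1[i] is the anchor embedding and view2[i] its positive.
    Requires at least two nodes so a negative can be drawn.
    """
    rng = rng or random.Random(0)
    n = len(view1)
    total = 0.0
    for i in range(n):
        # Pick one arbitrary node j != i from the second view as the negative.
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1
        pos = cosine(view1[i], view2[i])
        neg = cosine(view1[i], view2[j])
        # Penalize anchors whose positive does not outrank the negative by `margin`.
        total += max(0.0, margin - (pos - neg))
    return total / n
```

Because each anchor is compared with a single negative, this sketch evaluates two similarities per node instead of one per node pair.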
A novel epigenetic AML1-ETO/THAP10/miR-383 mini-circuitry contributes to t(8;21) leukaemogenesis
DNA methylation patterns are frequently deregulated in t(8;21) acute myeloid leukaemia (AML), but little is known of the mechanisms by which specific gene sets become aberrantly methylated. Here, we found that the promoter DNA methylation signature of t(8;21)(+) AML blasts differs from that of t(8;21)(-) AMLs. This study demonstrated that a novel hypermethylated zinc finger-containing protein, THAP10, is a target gene and can be epigenetically suppressed by AML1-ETO at the transcriptional level in t(8;21) AML. Our findings also show that THAP10 is a bona fide target of miR-383 that can be epigenetically activated by the AML1-ETO recruiting co-activator p300. In this study, we demonstrated that epigenetic suppression of THAP10 is the mechanistic link between AML1-ETO fusion proteins and tyrosine kinase cascades. In addition, we showed that THAP10 is a nuclear protein that inhibits myeloid proliferation and promotes differentiation both in vitro and in vivo. Altogether, our results revealed an unexpected and important epigenetic mini-circuit of AML1-ETO/THAP10/miR-383 in t(8;21) AML, in which epigenetic suppression of THAP10 predicts a poor clinical outcome and represents a novel therapeutic target
Comparison of the effects of negative pressure wound therapy and negative pressure wound therapy with instillation on wound healing in a porcine model
Background: Negative pressure wound therapy with instillation (NPWTi) is a novel method based on standard negative pressure wound therapy (NPWT). This study aimed to compare the effects of standard NPWT and NPWTi on bioburden and wound healing in a Staphylococcus aureus (S. aureus) infected porcine model. Methods: Green fluorescent protein-labeled S. aureus infected wounds were created on the backs of pigs. Wounds were treated with NPWT or NPWT with instillation (saline). The tissue specimens were harvested on days 0 (12 h after bacterial inoculation), 2, 4, 6, and 8 at the center of wound beds. Viable bacterial counts, laser scanning confocal microscopy, PCR, western blot, and histological analysis were performed to assess virulence and wound healing. Results: The bacterial count in the NPWTi group was lower than that of the NPWT group, and the difference was statistically significant on days 2, 4, 6, and 8 (P < 0.05). The expression levels of the agrA, Eap, Spa, and Hla genes in the NPWTi group were significantly lower than those in the NPWT group on day 8 (P < 0.05). The bacterial invasion depth in the NPWTi group was significantly lower than that in the NPWT group on days 2, 4, 6, and 8 (P < 0.05). Although the NPWTi group showed significantly higher expression of bFGF and VEGF than the NPWT group at early time points (P < 0.05), NPWTi did not lead to better histologic parameters than NPWT (P > 0.05). Conclusion: Our results demonstrated that NPWTi induced a greater decrease in bacterial burden and virulence than standard NPWT. These advantages did not result in better histologic parameters in the porcine wound model