375 research outputs found

VIGraph: Self-supervised Learning for Class-Imbalanced Node Classification

Author: Hu Yulan
Liu Yong
Ouyang Sheng
Yang Zhirui
Publication date: 02/11/2023

Class imbalance in graph data poses significant challenges for node classification. Existing methods, represented by SMOTE-based approaches, partially alleviate this issue but still exhibit limitations during imbalanced scenario construction. Self-supervised learning (SSL) offers a promising solution by synthesizing minority nodes from the data itself, yet its potential remains unexplored. In this paper, we analyze the limitations of SMOTE-based approaches and introduce VIGraph, a novel SSL model based on the self-supervised Variational Graph Auto-Encoder (VGAE) that leverages Variational Inference (VI) to generate minority nodes. Specifically, VIGraph strictly adheres to the concept of imbalance when constructing imbalanced graphs and utilizes the generative VGAE to generate minority nodes. Moreover, VIGraph introduces a novel Siamese contrastive strategy at the decoding phase to improve the overall quality of generated nodes. VIGraph can generate high-quality nodes without reintegrating them into the original graph, eliminating the "Generating, Reintegrating, and Retraining" process found in SMOTE-based methods. Experiments on multiple real-world datasets demonstrate that VIGraph achieves promising results for class-imbalanced node classification tasks

arXiv.org e-Print Archive
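
The VIGraph abstract describes generating minority nodes with a VGAE through variational inference. As a rough illustration of that building block only, the sketch below samples a latent vector with the reparameterization trick and computes the KL term against a standard normal prior. The function names, the two-dimensional latent, and the example encoder outputs are assumptions made for illustration; this is not the authors' implementation.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


rng = random.Random(0)
mu, log_var = [0.5, -1.0], [0.0, -2.0]  # hypothetical encoder outputs for one node
z = sample_latent(mu, log_var, rng)     # one sampled latent embedding
kl = kl_to_standard_normal(mu, log_var)
```

Sampling several latents around the encoded minority nodes and decoding them is one plausible way to obtain synthetic minority embeddings; how VIGraph actually does this is not specified in the abstract.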

Can Large Language Models Empower Molecular Property Prediction?

Author: Liang Hong
Liu Yong
Qian Chen
Tang Huayi
Yang Zhirui
Publication date: 14/07/2023

Molecular property prediction has gained significant attention due to its transformative potential in multiple scientific disciplines. Conventionally, a molecule can be represented either as graph-structured data or as SMILES text. Recently, the rapid development of Large Language Models (LLMs) has revolutionized the field of NLP. Although it is natural to utilize LLMs to assist in understanding molecules represented by SMILES, the exploration of how LLMs will impact molecular property prediction is still in its early stage. In this work, we advance towards this objective through two perspectives: zero/few-shot molecular classification, and using the new explanations generated by LLMs as representations of molecules. To be specific, we first prompt LLMs to do in-context molecular classification and evaluate their performance. After that, we employ LLMs to generate semantically enriched explanations for the original SMILES and then leverage them to fine-tune a small-scale LM for multiple downstream tasks. The experimental results highlight the superiority of text explanations as molecular representations across multiple benchmark datasets, and confirm the immense potential of LLMs in molecular property prediction tasks. Code is available at https://github.com/ChnQ/LLM4Mol

arXiv.org e-Print Archive
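
The in-context molecular classification step mentioned in the abstract above can be pictured as assembling a few labelled SMILES strings and one query into a single prompt. The sketch below is a minimal illustration under stated assumptions: the template wording, the function name `build_few_shot_prompt`, and the example labels are hypothetical and are not taken from the paper or its repository.

```python
def build_few_shot_prompt(examples, query_smiles, property_name):
    """Assemble a few-shot classification prompt from (SMILES, label) pairs."""
    lines = [
        f"Decide whether each molecule has the property: {property_name}.",
        "Answer with Yes or No.",
        "",
    ]
    for smiles, label in examples:
        lines += [f"SMILES: {smiles}", f"Answer: {'Yes' if label else 'No'}", ""]
    # The query is left unanswered so the model completes the final line.
    lines += [f"SMILES: {query_smiles}", "Answer:"]
    return "\n".join(lines)


# Illustrative demonstrations only; the labels are placeholders, not benchmark data.
examples = [("CCO", True), ("c1ccccc1", False)]
prompt = build_few_shot_prompt(examples, "CC(=O)O", "water-soluble")
```

With an empty `examples` list the same function yields a zero-shot prompt.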

MovePose: A High-performance Human Pose Estimation Algorithm on Mobile and Edge Devices

Author: An Wangpeng
Yang Yanhong
Yu Dongyang
Zhang Haoyue
Zhou Zhirui
Publication date: 17/08/2023

We present MovePose, an optimized lightweight convolutional neural network designed specifically for real-time body pose estimation on CPU-based mobile devices. The current solutions do not provide satisfactory accuracy and speed for human posture estimation, and MovePose addresses this gap. It aims to maintain real-time performance while improving the accuracy of human posture estimation for mobile devices. The network produces 17 keypoints for each individual at a rate exceeding 11 frames per second, making it suitable for real-time applications such as fitness tracking, sign language interpretation, and advanced mobile human posture estimation. Our MovePose algorithm has attained a Mean Average Precision (mAP) score of 67.7 on the COCO validation dataset. The MovePose algorithm ran at 69+ frames per second (fps) on an Intel i9-10920x CPU and at 452+ fps on an NVIDIA RTX3090 GPU. On an Android phone equipped with a Snapdragon 8 + 4G processor, the fps reached above 11. To enhance accuracy, we incorporated three techniques: deconvolution, large kernel convolution, and coordinate classification methods. Compared to basic upsampling, deconvolution is trainable, improves model capacity, and enhances the receptive field. Large kernel convolution strengthens these properties at a decreased computational cost. In summary, MovePose provides high accuracy and real-time performance, making it a potential tool for a variety of applications, including those focused on mobile-side human posture estimation. The code and models for this algorithm will be made publicly accessible

arXiv.org e-Print Archive
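
The MovePose abstract lists coordinate classification among its accuracy techniques without giving details. One common reading, assumed here, is that each keypoint coordinate is predicted as a classification over discrete bins along the x and y axes, and the winning bin is mapped back to a pixel position. The function name, the bin layout, and the `split_ratio` parameter below are illustrative assumptions rather than the MovePose design.

```python
def decode_keypoint(x_logits, y_logits, split_ratio=2.0):
    """Pick the highest-scoring bin on each axis and map it back to pixels.

    split_ratio is the assumed number of bins per pixel, so a bin index
    divided by split_ratio gives a sub-pixel coordinate.
    """
    x_bin = max(range(len(x_logits)), key=x_logits.__getitem__)
    y_bin = max(range(len(y_logits)), key=y_logits.__getitem__)
    return x_bin / split_ratio, y_bin / split_ratio


# Hypothetical per-axis scores for one keypoint on a 4x4-pixel input (8 bins per axis).
x_logits = [0.1, 0.3, 2.5, 0.2, 0.0, 0.1, 0.0, 0.0]
y_logits = [0.0, 0.1, 0.0, 0.2, 0.1, 3.0, 0.4, 0.0]
keypoint = decode_keypoint(x_logits, y_logits)
```

Decoding two one-dimensional score vectors per keypoint avoids producing a full two-dimensional heatmap, which is one reason this style of head is attractive on mobile hardware.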

Rethinking Word-Level Auto-Completion in Computer-Aided Translation

Author: Chen Xingyu
Huang Guoping
Liu Lemao
Shi Shuming
Wang Rui
Yang Mingming
Zhang Zhirui
Publication date: 24/10/2023

Word-Level Auto-Completion (WLAC) plays a crucial role in Computer-Assisted Translation. It aims at providing word-level auto-completion suggestions for human translators. While previous studies have primarily focused on designing complex model architectures, this paper takes a different perspective by rethinking the fundamental question: what kind of words are good auto-completions? We introduce a measurable criterion to answer this question and discover that existing WLAC models often fail to meet this criterion. Building upon this observation, we propose an effective approach to enhance WLAC performance by promoting adherence to the criterion. Notably, the proposed approach is general and can be applied to various encoder-based architectures. Through extensive experiments, we demonstrate that our approach outperforms the top-performing system submitted to the WLAC shared tasks in WMT2022, while utilizing significantly smaller model sizes. Comment: EMNLP202

arXiv.org e-Print Archive
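
To make the WLAC task in the abstract above concrete: given the characters a translator has typed so far, the system suggests one target word that starts with them. The sketch below shows only that prefix-constrained selection step. The candidate scores stand in for the output of some translation-aware model and are invented for illustration; the paper's criterion and model are not reproduced here.

```python
def complete_word(typed_prefix, scored_candidates):
    """Return the best-scoring candidate that starts with the typed prefix."""
    matches = [(word, score) for word, score in scored_candidates.items()
               if word.startswith(typed_prefix)]
    if not matches:
        return None  # nothing in the candidate set is compatible with the input
    return max(matches, key=lambda pair: pair[1])[0]


# Hypothetical scores for the next target word.
candidates = {"house": 0.40, "home": 0.35, "hotel": 0.10, "building": 0.15}
suggestion = complete_word("ho", candidates)
```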

HGCVAE: Integrating Generative and Contrastive Learning for Heterogeneous Graph Learning

Author: Hu Yulan
Liu Yong
Ouyang Sheng
Wan Junchen
Wang Zhongyuan
Yang Zhirui
Zhang Fuzheng
Publication date: 19/10/2023

Generative self-supervised learning (SSL) has exhibited significant potential and garnered increasing interest in graph learning. In this study, we aim to explore the problem of generative SSL in the context of heterogeneous graph learning (HGL). The previous SSL approaches for heterogeneous graphs have primarily relied on contrastive learning, necessitating the design of complex views to capture heterogeneity. However, existing generative SSL methods have not fully leveraged the capabilities of generative models to address the challenges of HGL. In this paper, we present HGCVAE, a novel contrastive variational graph auto-encoder that liberates HGL from the burden of intricate heterogeneity capturing. Instead of focusing on complicated heterogeneity, HGCVAE harnesses the full potential of generative SSL. HGCVAE innovatively consolidates contrastive learning with generative SSL, introducing several key innovations. Firstly, we employ a progressive mechanism to generate high-quality hard negative samples for contrastive learning, utilizing the power of variational inference. Additionally, we present a dynamic mask strategy to ensure effective and stable learning. Moreover, we propose an enhanced scaled cosine error as the criterion for better attribute reconstruction. As an initial step in combining generative and contrastive SSL, HGCVAE achieves remarkable results compared to various state-of-the-art baselines, confirming its superiority

arXiv.org e-Print Archive
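
The HGCVAE abstract mentions an enhanced scaled cosine error for attribute reconstruction but does not define it. For orientation, the sketch below implements the plain scaled cosine error as it is commonly written in masked graph auto-encoder work, (1 - cos(x, x_hat))^gamma averaged over nodes. The enhancement proposed in the paper is not shown, and the default `gamma` value is an assumption.

```python
import math


def scaled_cosine_error(x, x_hat, gamma=2.0):
    """(1 - cos(x, x_hat)) ** gamma for one node; assumes non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, x_hat))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in x_hat))
    return (1.0 - dot / norm) ** gamma


def sce_loss(originals, reconstructions, gamma=2.0):
    """Mean scaled cosine error over a batch of node attribute vectors."""
    errors = [scaled_cosine_error(x, r, gamma)
              for x, r in zip(originals, reconstructions)]
    return sum(errors) / len(errors)


# Toy attributes: the first node is reconstructed up to scale, the second is orthogonal.
loss = sce_loss([[1.0, 0.0], [1.0, 0.0]], [[2.0, 0.0], [0.0, 3.0]])
```

Raising the error to a power gamma greater than 1 shrinks the contribution of already well-reconstructed nodes, so training focuses on the harder ones.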

Isolation of archaeal viruses with lipid membrane from Tengchong acidic hot springs

Author: Chang Tian
Changyi Zhang
Wei Yang
Xi Feng
Xinyu Liu
Yanan Li
Zhirui Zeng
Publication venue: 'Frontiers Media SA'
Publication date: 01/03/2023

Archaeal viruses are one of the most mysterious parts of the virosphere because of their diverse morphologies and unique genome contents. The crenarchaeal viruses are commonly found in high-temperature, acidic hot springs, and the number of identified crenarchaeal viruses has increased rapidly over the past two decades. Over fifty viruses infecting members of the order Sulfolobales have been identified, most of which are from hot springs distributed in the United States, Russia, Iceland, Japan, and Italy. To further expand the reservoir of viruses infecting strains of Sulfolobaceae, we investigated virus diversity through cultivation-dependent approaches in hot springs in Tengchong, Yunnan, China. Eight different virus-like particles were detected in enrichment cultures, among which five new archaeal viruses were isolated and characterized. We showed that these viruses can infect acidophilic hyperthermophiles belonging to three different genera of the family Sulfolobaceae, namely, Saccharolobus, Sulfolobus, and Metallosphaera. We also compared the lipid compositions of the viral and cellular membranes and found that the lipid composition of some viral envelopes was very different from that of the host membrane. Collectively, our results showed that the Tengchong hot springs harbor highly diverse viruses, providing excellent models for archaeal virus-host studies

Directory of Open Access Journals

Do We Really Need Contrastive Learning for Graph Representation?

Author: Chen Ge
Hu Yulan
Liu Jingyu
Liu Yong
Ouyang Sheng
Wan Junchen
Wang Zhongyuan
Yang Zhirui
Zhang Fuzheng
Publication date: 22/10/2023

In recent years, contrastive learning has emerged as a dominant self-supervised paradigm, attracting numerous research interests in the field of graph learning. Graph contrastive learning (GCL) aims to embed augmented anchor samples close to each other while pushing the embeddings of other samples (negative samples) apart. However, existing GCL methods require large and diverse negative samples to ensure the quality of embeddings, and recent studies typically leverage samples excluding the anchor and positive samples as negative samples, potentially introducing false negative samples (negatives that share the same class as the anchor). Additionally, this practice can result in a heavy computational burden and a high time complexity of $O(N^2)$, which is particularly unaffordable for large graphs. To address these deficiencies, we leverage rank learning and propose a simple yet effective model, GraphRank. Specifically, we first generate two graph views through corruption. Then, we compute the similarity of pairwise nodes (anchor node and positive node) in both views; an arbitrary node in the latter view is selected as a negative node, and its similarity with the anchor node is computed. Based on this, we introduce rank-based learning to measure similarity scores, which successfully relieves the false negative problem and decreases the time complexity from $O(N^2)$ to $O(N)$. Moreover, we conducted extensive experiments across multiple graph tasks, demonstrating that GraphRank performs favorably against other cutting-edge GCL methods in various tasks

arXiv.org e-Print Archive
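
The GraphRank abstract describes comparing each anchor with its positive node and with one arbitrary negative node from the other view, so that only a linear number of similarities is needed. A minimal sketch of that idea follows, assuming cosine similarity and a hinge-style margin ranking loss; the margin value, the function names, and the exact loss form are assumptions and may differ from the paper.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def rank_loss(view_a, view_b, rng, margin=0.5):
    """Hinge ranking loss with one sampled negative per anchor.

    Each anchor needs two similarities, so N anchors cost O(N) rather
    than the O(N^2) of comparing every anchor with every other node.
    Requires at least two nodes.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1  # shift past the anchor's index so the negative is a different node
        s_pos = cosine(view_a[i], view_b[i])  # same node in the other view
        s_neg = cosine(view_a[i], view_b[j])  # one arbitrary other node
        total += max(0.0, margin - (s_pos - s_neg))
    return total / n


# Toy embeddings of three nodes under two corrupted views.
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
view_b = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
loss = rank_loss(view_a, view_b, random.Random(0))
```

Because the loss only asks that the positive rank above the sampled negative by a margin, a negative that happens to share the anchor's class is penalized less harshly than in a loss that pushes all negatives arbitrarily far away.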

A novel epigenetic AML1-ETO/THAP10/miR-383 mini-circuitry contributes to t(8;21) leukaemogenesis

Author: Anqi Liu
Clara Nervi
Daihong Liu
De Souza Santos E
Jinlong Shi
Li Gao
Li Yu
Lili Wang
Mengmeng Jiang
Michael Q Zhang
Qiaoyang Ning
Sai Huang
Wenrong Huang
Yang Chen
Yonghui Li
Yu Jing
Yun Dai
Zhirui Hu
Publication venue: 'EMBO'
Publication date: 01/01/2017

DNA methylation patterns are frequently deregulated in t(8;21) acute myeloid leukaemia (AML), but little is known of the mechanisms by which specific gene sets become aberrantly methylated. Here, we found that the promoter DNA methylation signature of t(8;21)(+) AML blasts differs from that of t(8;21)(-) AMLs. This study demonstrated that a novel hypermethylated zinc finger-containing protein, THAP10, is a target gene and can be epigenetically suppressed by AML1-ETO at the transcriptional level in t(8;21) AML. Our findings also show that THAP10 is a bona fide target of miR-383 that can be epigenetically activated by the AML1-ETO recruiting co-activator p300. In this study, we demonstrated that epigenetic suppression of THAP10 is the mechanistic link between AML1-ETO fusion proteins and tyrosine kinase cascades. In addition, we showed that THAP10 is a nuclear protein that inhibits myeloid proliferation and promotes differentiation both in vitro and in vivo. Altogether, our results revealed an unexpected and important epigenetic mini-circuit of AML1-ETO/THAP10/miR-383 in t(8;21) AML, in which epigenetic suppression of THAP10 predicts a poor clinical outcome and represents a novel therapeutic target

Crossref

Directory of Open Access Journals

Archivio della ricerca- Università di Roma La Sapienza

Comparison of the effects of negative pressure wound therapy and negative pressure wound therapy with instillation on wound healing in a porcine model

Author: An Xiao
Feng Xinyue
Li Rui
Li Zhirui
Lin Feng
Liu Daohong
Sun Tingting
Wang Guoqi
Yang Tiantian
Publication venue: 'Frontiers Media SA'
Publication date: 01/04/2023

Background: Negative pressure wound therapy with instillation (NPWTi) is a novel method based on standard negative pressure wound therapy (NPWT). This study aimed to compare the effects of standard NPWT and NPWTi on bioburden and wound healing in a Staphylococcus aureus (S. aureus) infected porcine model. Methods: Wounds infected with green fluorescent protein-labeled S. aureus were created on the backs of pigs. Wounds were treated with NPWT or NPWT with instillation (saline). The tissue specimens were harvested on days 0 (12 h after bacterial inoculation), 2, 4, 6, and 8 at the center of wound beds. Viable bacterial counts, laser scanning confocal microscopy, PCR, western blot, and histological analysis were performed to assess virulence and wound healing. Results: The bacterial count in the NPWTi group was lower than that of the NPWT group, and the difference was statistically significant on days 2, 4, 6, and 8 (P < 0.05). The expression levels of the agrA, Eap, Spa, and Hla genes in the NPWTi group were significantly lower than those in the NPWT group on day 8 (P < 0.05). The bacterial invasion depth of the NPWTi group was significantly lower than that of the NPWT group on days 2, 4, 6, and 8 (P < 0.05). Although the NPWTi group showed significantly higher expression of bFGF and VEGF than the NPWT group at early time points (P < 0.05), NPWTi did not lead to better histologic parameters than NPWT (P > 0.05). Conclusion: Our results demonstrated that NPWTi induced a greater decrease in bacterial burden and virulence compared with standard NPWT. These advantages did not result in better histologic parameters in the porcine wound model

Directory of Open Access Journals