
57 research outputs found

DNA Storage: A Promising Large Scale Archival Storage?

Author: Du David H. C.
Li Bingzhe
Wei Yixun
Publication date: 04/04/2022

Deoxyribonucleic Acid (DNA), with its high density and long durability, is a promising storage medium for long-term archival storage and has attracted much attention. Several studies have verified the feasibility of using DNA for archival storage with a small amount of data. However, the achievable storage capacity of DNA as archival storage has not been comprehensively investigated yet. Theoretically, the DNA storage density is about 1 exabyte/mm³ (10⁹ GB/mm³). However, according to our investigation, DNA storage tube capacity based on the current synthesizing and sequencing technologies is only at hundreds of gigabytes due to the limitation of multiple bio and technology constraints. This paper identifies and investigates the critical factors affecting the single DNA tube capacity for archival storage. Finally, we suggest several promising directions to overcome the limitations and enhance DNA storage capacity.
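
The unit conversion behind the quoted theoretical density can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming decimal (SI) units; the constant and function names are illustrative and do not come from the paper.

```python
# Assumption: decimal (SI) units, i.e. 1 EB = 10**18 bytes and 1 GB = 10**9 bytes.
BYTES_PER_EB = 10**18
BYTES_PER_GB = 10**9


def exabytes_to_gigabytes(exabytes: int) -> int:
    """Convert a whole number of exabytes to gigabytes (hypothetical helper)."""
    return exabytes * (BYTES_PER_EB // BYTES_PER_GB)


# Theoretical density quoted in the abstract: about 1 exabyte per cubic millimetre.
theoretical_gb_per_mm3 = exabytes_to_gigabytes(1)
print(theoretical_gb_per_mm3)
```

Under these assumptions, 1 exabyte/mm³ corresponds to 10⁹ GB/mm³, which matches the figure given in the abstract.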

arXiv.org e-Print Archive

Genome-wide investigation and expression analysis of OSCA gene family in response to abiotic stress in alfalfa

Author: Bingzhe Fu
Shuxia Li
Wenqi Cai
Wenxue Song
Xiaohong Li
Xiaotong Wang
Xuxia Ma
Yaling Liu
Publication venue: Frontiers Media S.A.
Publication date: 01/11/2023

Alfalfa is an excellent leguminous forage crop that is widely cultivated worldwide, but its yield and quality are often affected by drought and soil salinization. Hyperosmolality-gated calcium-permeable channel (OSCA) proteins are hyperosmotic calcium ion (Ca²⁺) receptors that play an essential role in regulating plant growth, development, and abiotic stress responses. However, no systematic analysis of the OSCA gene family has been conducted in alfalfa. In this study, a total of 14 OSCA genes were identified from the alfalfa genome and classified into three groups based on their sequence composition and phylogenetic relationships. Gene structure, conserved motifs and functional domain prediction showed that all MsOSCA genes had the same functional domain DUF221. Cis-acting element analysis showed that MsOSCA genes had many cis-regulatory elements in response to abiotic or biotic stresses and hormones. Tissue expression pattern analysis demonstrated that the MsOSCA genes had tissue-specific expression; for example, MsOSCA12 was only expressed in roots and leaves but not in stem and petiole tissues. Furthermore, RT–qPCR results indicated that the expression of MsOSCA genes was induced by abiotic stress (drought and salt) and hormones (JA, SA, and ABA). In particular, the expression levels of MsOSCA3, MsOSCA5, MsOSCA12 and MsOSCA13 were significantly increased under drought and salt stress, and MsOSCA7, MsOSCA10, MsOSCA12 and MsOSCA13 genes exhibited significant upregulation under plant hormone treatments, indicating that these genes play a positive role in drought, salt and hormone responses. Subcellular localization results showed that the MsOSCA3 protein was localized on the plasma membrane. This study provides a basis for understanding the biological information and further functional analysis of the MsOSCA gene family and provides candidate genes for stress resistance breeding in alfalfa.

Directory of Open Access Journals

Rethinking and Simplifying Bootstrapped Graph Latents

Author: Bian Yatao
Chen Liang
Li Jintang
Sun Wangbin
Wu Bingzhe
Zheng Zibin
Publication date: 05/12/2023

Graph contrastive learning (GCL) has emerged as a representative paradigm in graph self-supervised learning, where negative samples are commonly regarded as the key to preventing model collapse and producing distinguishable representations. Recent studies have shown that GCL without negative samples can achieve state-of-the-art performance as well as scalability improvement, with bootstrapped graph latent (BGRL) as a prominent step forward. However, BGRL relies on a complex architecture to maintain the ability to scatter representations, and the underlying mechanisms enabling the success remain largely unexplored. In this paper, we introduce an instance-level decorrelation perspective to tackle the aforementioned issue and leverage it as a springboard to reveal the potential unnecessary model complexity within BGRL. Based on our findings, we present SGCL, a simple yet effective GCL framework that utilizes the outputs from two consecutive iterations as positive pairs, eliminating the negative samples. SGCL only requires a single graph augmentation and a single graph encoder without additional parameters. Extensive experiments conducted on various graph benchmarks demonstrate that SGCL can achieve competitive performance with fewer parameters, lower time and space costs, and significant convergence speedup.

Comment: Accepted by WSDM 202

arXiv.org e-Print Archive

SAILOR: Structural Augmentation Based Tail Node Representation Learning

Author: Bian Yatao
Chen Liang
Li Jintang
Liao Jie
Wu Bingzhe
Zheng Zibin
Publication date: 14/08/2023

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in representation learning for graphs recently. However, the effectiveness of GNNs, which capitalize on the key operation of message propagation, highly depends on the quality of the topology structure. Most of the graphs in real-world scenarios follow a long-tailed distribution on their node degrees, that is, a vast majority of the nodes in the graph are tail nodes with only a few connected edges. GNNs produce inferior node representations for tail nodes since they lack structural information. In the pursuit of promoting the expressiveness of GNNs for tail nodes, we explore how the deficiency of structural information deteriorates the performance of tail nodes and propose a general Structural Augmentation based taIL nOde Representation learning framework, dubbed as SAILOR, which can jointly learn to augment the graph structure and extract more informative representations for tail nodes. Extensive experiments on public benchmark datasets demonstrate that SAILOR can significantly improve the tail node representations and outperform the state-of-the-art baselines.

Comment: Accepted by CIKM 2023; Code is available at https://github.com/Jie-Re/SAILO
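
To make the notion of a tail node concrete, the sketch below counts node degrees in a small undirected edge list and flags the low-degree nodes. It is only an illustration of the definition in the abstract, not the SAILOR method; the function names, the toy graph, and the degree threshold `max_degree=2` are assumptions chosen for the example.

```python
from collections import Counter


def node_degrees(edges):
    """Count the degree of every node in an undirected edge list."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def tail_nodes(edges, max_degree=2):
    """Return nodes whose degree is at most max_degree (threshold is an assumption)."""
    degrees = node_degrees(edges)
    return sorted(node for node, deg in degrees.items() if deg <= max_degree)


# Toy graph (hypothetical): node 0 is a hub, the remaining nodes have few edges.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
print(node_degrees(edges))
print(tail_nodes(edges))
```

In this toy graph node 0 has degree 4, while nodes 1 to 4 have degree 1 or 2 and are therefore treated as tail nodes under the assumed threshold.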

arXiv.org e-Print Archive