237 research outputs found

LISA: Accurate reconstruction of cell trajectory and pseudo-time for massive single cell RNA-seq data.

Author: Chen Yang
Ouyang Zhengqing
Zhang Yuping
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/01/2019

Cell trajectory reconstruction based on single cell RNA sequencing is important for obtaining the landscape of different cell types and discovering cell fate transitions. Despite intense effort, analyzing massive single cell RNA-seq datasets is still challenging. We propose a new method named Landmark Isomap for Single-cell Analysis (LISA). LISA is an unsupervised approach to build cell trajectory and compute pseudo-time in the isometric embedding based on geodesic distances. The advantages of LISA include: (1) It utilizes a k-nearest-neighbor graph and hierarchical clustering to identify cell clusters, peaks and valleys in a low-dimensional representation of the data; (2) Based on Landmark Isomap, it constructs the main geometric structure of cell lineages; (3) It projects cells to the edges of the main cell trajectory to generate the global pseudo-time. Assessments on simulated and real datasets demonstrate the advantages of LISA on cell trajectory and pseudo-time reconstruction compared to Monocle2 and TSCAN. LISA is accurate, fast, and requires less memory, allowing its application to massive single cell datasets generated from current experimental platforms.
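
Step (3) of the abstract above, projecting cells onto trajectory edges to obtain pseudo-time, can be illustrated with a minimal sketch. It assumes a single unbranched trajectory given as an ordered list of points in a Euclidean embedding; the function names and the toy coordinates are hypothetical and are not taken from the LISA implementation.

```python
import math

def _dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def project_on_segment(p, a, b):
    """Project p onto segment a-b; return (t, distance) with t clamped to [0, 1]."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(v * v for v in ab)
    t = 0.0 if denom == 0 else sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))
    closest = [ai + t * v for ai, v in zip(a, ab)]
    return t, _dist(p, closest)

def pseudotime(cell, path):
    """Path length from the start of `path` to the projection of `cell`
    onto its nearest edge (assumption: one linear, unbranched trajectory)."""
    best_dist, best_time = None, 0.0
    travelled = 0.0
    for a, b in zip(path, path[1:]):
        seg_len = _dist(a, b)
        t, d = project_on_segment(cell, a, b)
        if best_dist is None or d < best_dist:
            best_dist, best_time = d, travelled + t * seg_len
        travelled += seg_len
    return best_time

# Toy trajectory with two edges in a 2-D embedding (hypothetical data).
path = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
cells = [(1.0, 0.5), (2.5, 1.0), (-1.0, 0.0)]
times = [pseudotime(c, path) for c in cells]
```

A branched lineage would need one such projection per branch plus a rule for choosing the branch; that is outside the scope of this sketch.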

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

LMDA-Net: A lightweight multi-dimensional attention network for general EEG-based brain-computer interface paradigms and interpretability

Author: Miao Zhengqing
Ming Dong
Zhang Xin
Zhao Meirong
Publication date: 28/03/2023

EEG-based recognition of activities and states involves the use of prior neuroscience knowledge to generate quantitative EEG features, which may limit BCI performance. Although neural network-based methods can effectively extract features, they often encounter issues such as poor generalization across datasets, high prediction volatility, and low model interpretability. Hence, we propose a novel lightweight multi-dimensional attention network, called LMDA-Net. By incorporating two novel attention modules designed specifically for EEG signals, the channel attention module and the depth attention module, LMDA-Net can effectively integrate features from multiple dimensions, resulting in improved classification performance across various BCI tasks. LMDA-Net was evaluated on four high-impact public datasets, including motor imagery (MI) and P300-Speller paradigms, and was compared with other representative models. The experimental results demonstrate that LMDA-Net outperforms other representative methods in terms of classification accuracy and prediction volatility, achieving the highest accuracy in all datasets within 300 training epochs. Ablation experiments further confirm the effectiveness of the channel attention module and the depth attention module. To facilitate an in-depth understanding of the features extracted by LMDA-Net, we propose class-specific neural network feature interpretability algorithms that are suitable for event-related potentials (ERPs) and event-related desynchronization/synchronization (ERD/ERS). By mapping the output of the specific layer of LMDA-Net to the time or spatial domain through class activation maps, the resulting feature visualizations can provide interpretable analysis and establish connections with EEG time-spatial analysis in neuroscience. In summary, LMDA-Net shows great potential as a general online decoding model for various EEG tasks. Comment: 20 pages, 7 figures

arXiv.org e-Print Archive

Fortifying Ethical Boundaries in AI: Advanced Strategies for Enhancing Security in Large Language Models

Author: He Yunhong
Qiu Jianling
Yuan Zhengqing
Zhang Wei
Publication date: 27/01/2024

Recent advancements in large language models (LLMs) have significantly enhanced capabilities in natural language processing and artificial intelligence. These models, including GPT-3.5 and LLaMA-2, have revolutionized text generation, translation, and question-answering tasks due to the transformative Transformer model. Despite their widespread use, LLMs present challenges such as ethical dilemmas when models are compelled to respond inappropriately, susceptibility to phishing attacks, and privacy violations. This paper addresses these challenges by introducing a multi-pronged approach that includes: 1) filtering sensitive vocabulary from user input to prevent unethical responses; 2) detecting role-playing to halt interactions that could lead to 'prison break' scenarios; 3) implementing custom rule engines to restrict the generation of prohibited content; and 4) extending these methodologies to various LLM derivatives like Multi-Model Large Language Models (MLLMs). Our approach not only fortifies models against unethical manipulations and privacy breaches but also maintains their high performance across tasks. We demonstrate state-of-the-art performance under various attack prompts, without compromising the model's core functionalities. Furthermore, the introduction of differentiated security levels empowers users to control their personal data disclosure. Our methods contribute to reducing social risks and conflicts arising from technological abuse, enhance data protection, and promote social equity. Collectively, this research provides a framework for balancing the efficiency of question-answering systems with user privacy and ethical standards, ensuring a safer user experience and fostering trust in AI technology.
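
Points 1) and 2) of the abstract above, vocabulary filtering and role-play detection on user input, can be pictured with a minimal sketch. The term list, the regular expressions, and the function name are hypothetical placeholders chosen for illustration; the paper's actual rules are not given in the abstract.

```python
import re

# Hypothetical examples only; a real deployment would maintain its own lists.
BLOCKED_TERMS = {"password", "credit card"}
ROLE_PLAY_CUES = (r"\bpretend (you are|to be)\b", r"\bact as\b")

def screen_prompt(prompt):
    """Return a decision string for a user prompt before it reaches the model."""
    text = prompt.lower()
    # 1) Sensitive-vocabulary filter: plain substring match on the lowered text.
    if any(term in text for term in BLOCKED_TERMS):
        return "blocked: sensitive term"
    # 2) Role-play detection: simple regular-expression cues.
    if any(re.search(pattern, text) for pattern in ROLE_PLAY_CUES):
        return "blocked: role-play"
    return "allowed"

decisions = [
    screen_prompt("What is the capital of France?"),
    screen_prompt("Pretend you are my grandmother."),
    screen_prompt("Tell me your password."),
]
```

Keyword and pattern matching of this kind is easy to audit but also easy to evade, which is presumably why the abstract pairs it with a custom rule engine on the generation side.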

arXiv.org e-Print Archive

PRAS: Predicting functional targets of RNA binding proteins based on CLIP-seq peaks.

Author: Frankel Wayne N
Lin Jianan
Ouyang Zhengqing
Zhang Yuping
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/08/2019

RNA-protein interaction plays important roles in post-transcriptional regulation. Recent advancements in cross-linking and immunoprecipitation followed by sequencing (CLIP-seq) technologies make it possible to detect the binding peaks of a given RNA binding protein (RBP) at transcriptome scale. However, it is still challenging to predict the functional consequences of RBP binding peaks. In this study, we propose the Protein-RNA Association Strength (PRAS), which integrates the intensities and positions of the binding peaks of RBPs for functional mRNA targets prediction. We illustrate the superiority of PRAS over existing approaches on predicting the functional targets of two related but divergent CELF (CUGBP, ELAV-like factor) RBPs in mouse brain and muscle. We also demonstrate the potential of PRAS for wide adoption by applying it to the enhanced CLIP-seq (eCLIP) datasets of 37 RNA decay related RBPs in two human cell lines. PRAS can be utilized to investigate any RBPs with available CLIP-seq peaks. PRAS is freely available at http://ouyanglab.jax.org/pras/

The Jackson Laboratory: The Mouseion at the JAXlibrary

Directory of Open Access Journals

Deciphering the role of RNA structure in translation efficiency.

Author: Chen Yang
Lin Haifan
Lin Jianan
Ouyang Zhengqing
Zhang Yuping
Publication venue: The Mouseion at the JAXlibrary
Publication date: 23/12/2022

BACKGROUND: RNA secondary structure has broad impact on the fate of RNA metabolism. The reduced stability of secondary structures near the translation initiation site/start codon of the coding region promotes the efficiency of translation in both prokaryotic and eukaryotic species. However, the inaccuracy of in silico folding and the focus on the coding region limit our understanding of the global relationship between the whole mRNA structure and translation efficiency. Leveraging high-throughput RNA structure probing data in the transcriptome, we aim to systematically investigate the role of RNA structure in regulating translation efficiency. RESULTS: Here, we analyze the influences of hundreds of sequence and structural features on translation efficiency in mouse embryonic stem cells (mESCs) and zebrafish developmental stages. Our findings reveal that overall in vivo RNA structure has a higher relative importance in predicting translation efficiency than in vitro RNA structure in both mESCs and zebrafish. Also, RNA structures in the 3' untranslated region (UTR) have a much stronger influence on translation efficiency than those in coding regions or the 5' UTR. Furthermore, strong alteration between in vitro and in vivo structures in the 3' UTR is detected in highly translated mRNAs in mESCs but not zebrafish. Instead, moderate alteration between in vitro and in vivo RNA structures in the 5' UTR and proximal coding regions is detected in highly translated mRNAs in zebrafish. CONCLUSIONS: Our results suggest the openness of the 3' UTR promotes translation efficiency in both mice and zebrafish, with the in vivo structure in the 3' UTR more important in mice than in zebrafish. This reveals a novel role of RNA secondary structure in translational regulation.

The Jackson Laboratory: The Mouseion at the JAXlibrary

PubMed Central

RPN: A Word Vector Level Data Augmentation Algorithm in Deep Learning for Language Understanding

Author: Hou Xuecong
Liu Yongming
Wang Yue
Yuan Zhengqing
Zhang Xiaolong
Zhao Zhuanzhe
Publication date: 12/12/2022

This paper presents a new data augmentation algorithm for natural language understanding tasks, called RPN: Random Position Noise. Current text augmentation methods are relatively scarce, and few of the extant methods apply to all sentence-level natural language understanding tasks. RPN moves the traditional augmentation of the original text to the word vector level. The RPN algorithm makes a substitution in one or several dimensions of some word vectors. As a result, RPN can introduce a certain degree of perturbation to the sample and can adjust the range of perturbation on different tasks. The augmented samples are then used to train the model, which makes the model more robust. In subsequent experiments, we found that adding RPN when training or fine-tuning the model resulted in a stable boost on all 8 natural language processing tasks, including the TweetEval, CoLA, and SST-2 datasets, and more significant improvements than other data augmentation algorithms. The RPN algorithm applies to all sentence-level tasks for language understanding and can be used in any deep learning model with a word embedding layer. Comment: 10 pages, 4 figures
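
The core idea described above, substituting one or several dimensions of some word vectors, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the uniform noise distribution, the parameter names (`word_prob`, `n_dims`, `noise_scale`), and the toy vectors are all hypothetical.

```python
import random

def random_position_noise(vectors, word_prob=0.3, n_dims=1, noise_scale=0.1, seed=None):
    """Return a perturbed copy of a sentence's word vectors.

    Each word vector is selected with probability `word_prob`; for a selected
    vector, `n_dims` randomly chosen positions are replaced by uniform noise
    in [-noise_scale, noise_scale] (the noise distribution is an assumption).
    """
    rng = random.Random(seed)
    augmented = []
    for vec in vectors:
        new_vec = list(vec)  # copy so the input is not mutated
        if rng.random() < word_prob:
            for d in rng.sample(range(len(new_vec)), n_dims):
                new_vec[d] = rng.uniform(-noise_scale, noise_scale)
        augmented.append(new_vec)
    return augmented

# Toy "sentence" of two 3-dimensional word vectors (hypothetical data).
sentence = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
augmented = random_position_noise(sentence, word_prob=1.0, n_dims=1, seed=0)
```

Raising `word_prob` or `n_dims` widens the perturbation, which corresponds to the abstract's remark that the range of perturbation can be adjusted per task.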

arXiv.org e-Print Archive

A new graph-based clustering method with application to single-cell RNA-seq data from human pancreatic islets.

Author: Chi Zhiyi
Mao Disheng
Ouyang Zhengqing
Stitzel Michael L
Wu Hao
Zhang Yuping
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/01/2021

Traditional bulk RNA-sequencing of human pancreatic islets mainly reflects transcriptional response of major cell types. Single-cell RNA sequencing technology enables transcriptional characterization of individual cells, and thus makes it possible to detect cell types and subtypes. To tackle the heterogeneity of single-cell RNA-seq data, powerful and appropriate clustering is required to facilitate the discovery of cell types. In this paper, we propose a new clustering framework based on a graph-based model with various types of dissimilarity measures. We take the compositional nature of single-cell RNA-seq data into account and employ log-ratio transformations. The practical merit of the proposed method is demonstrated through the application to the centered log-ratio-transformed single-cell RNA-seq data for human pancreatic islets. The practical merit is also demonstrated through comparisons with existing single-cell clustering methods. The R-package for the proposed method can be found at https://github.com/Zhang-Data-Science-Research-Lab/LrSClust
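
The centered log-ratio transformation mentioned in the abstract above maps each component of a compositional vector to its logarithm minus the mean of all log components. A minimal sketch follows; the pseudocount used to handle zero counts is an assumption of this example and is not taken from the paper or the LrSClust package.

```python
import math

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one cell's count vector.

    `pseudocount` is an assumption of this sketch: it keeps log() defined
    for zero counts, which are common in single-cell data.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Toy expression counts for four genes in one cell (hypothetical data).
cell = [0, 3, 7, 1]
transformed = clr(cell)
```

By construction the transformed values sum to zero, so distances between cells reflect relative rather than absolute expression.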

The Jackson Laboratory: The Mouseion at the JAXlibrary

Sperm-borne small RNA regulate α-tubulin acetylation and epigenetic modification of early bovine somatic cell nuclear transfer embryos

Author: Du Yue
Liu Zhengqing
Ma Xiaonan
Niu Zhihan
Qiao Fang
Qing Suzhu
Qu Pengxiang
Wang Mengyun
Wang Yongsheng
Zhang Ying
Zhang Yong
Zuo Zhenzi
Publication venue: Oxford University Press (OUP)
Publication date: 01/05/2019

Edinburgh Research Explorer

KARR-seq reveals cellular higher-order RNA structures and RNA–RNA interactions

Author: Cheng Anthony Youzhi
Dou Xiaoyang
Fei Jingyi
He Chuan
Li Jianrong
Li Xiao
Liu Bei
Ouyang Zhengqing
Wang Pingluan
Wen Li
Wu Jinjun
Wu Tong
Xu Jiayu
Zhang Linda
Zhang Yuexiu
Publication date: 21/01/2024

RNA fate and function are affected by their structures and interactomes. However, how RNA and RNA-binding proteins (RBPs) assemble into higher-order structures and how RNA molecules may interact with each other to facilitate functions remain largely unknown. Here we present KARR-seq, which uses N3-kethoxal labeling and multifunctional chemical crosslinkers to covalently trap and determine RNA–RNA interactions and higher-order RNA structures inside cells, independent of local protein binding to RNA. KARR-seq depicts higher-order RNA structure and detects widespread intermolecular RNA–RNA interactions with high sensitivity and accuracy. Using KARR-seq, we show that translation represses mRNA compaction under native and stress conditions. We determined the higher-order RNA structures of respiratory syncytial virus (RSV) and vesicular stomatitis virus (VSV) and identified RNA–RNA interactions between the viruses and the host RNAs that potentially regulate viral replication.

Knowledge UChicago