78 research outputs found
ncRNA Classification with Graph Convolutional Networks
Non-coding RNA (ncRNA) are RNA sequences which don't code for a gene but
instead carry important biological functions. The task of ncRNA classification
consists in classifying a given ncRNA sequence into its family. While it has
been shown that the graph structure of an ncRNA sequence folding is of great
importance for the prediction of its family, current methods make use of
machine learning classifiers on hand-crafted graph features. We improve on the
state-of-the-art for this task with a graph convolutional network model which
achieves an accuracy of 85.73% and an F1-score of 85.61% over 13 classes.
Moreover, our model learns in an end-to-end fashion from the raw RNA graphs and
removes the need for expensive feature extraction. To the best of our
knowledge, this also represents the first successful application of graph
convolutional networks to RNA folding data
De novo prediction of RNA-protein interactions with graph neural networks
RNA-binding proteins (RBPs) are key co- and post-transcriptional regulators of gene expression, playing a crucial role in many biological processes. Experimental methods like CLIP-seq have enabled the identification of transcriptome-wide RNA-protein interactions for select proteins; however, the time- and resource-intensive nature of these technologies call for the development of computational methods to complement their predictions. Here, we leverage recent, large-scale CLIP-seq experiments to construct a de novo predictor of RNA-protein interactions based on graph neural networks (GNN). We show that the GNN method allows us not only to predict missing links in an RNA-protein network, but to predict the entire complement of targets of previously unassayed proteins, and even to reconstruct the entire network of RNA-protein interactions in different conditions based on minimal information. Our results demonstrate the potential of modern machine learning methods to extract useful information on post-transcriptional regulation from large data sets
A review of multi-omics data integration through deep learning approaches for disease diagnosis, prognosis, and treatment
Accurate diagnosis is the key to providing prompt and explicit treatment and disease management. The recognized biological method for the molecular diagnosis of infectious pathogens is polymerase chain reaction (PCR). Recently, deep learning approaches are playing a vital role in accurately identifying disease-related genes for diagnosis, prognosis, and treatment. The models reduce the time and cost used by wet-lab experimental procedures. Consequently, sophisticated computational approaches have been developed to facilitate the detection of cancer, a leading cause of death globally, and other complex diseases. In this review, we systematically evaluate the recent trends in multi-omics data analysis based on deep learning techniques and their application in disease prediction. We highlight the current challenges in the field and discuss how advances in deep learning methods and their optimization for application is vital in overcoming them. Ultimately, this review promotes the development of novel deep-learning methodologies for data integration, which is essential for disease detection and treatment
Graph neural networks and attention-based CNN-LSTM for protein classification
This paper focuses on three critical problems on protein classification.
Firstly, Carbohydrate-active enzyme (CAZyme) classification can help people to
understand the properties of enzymes. However, one CAZyme may belong to several
classes. This leads to Multi-label CAZyme classification. Secondly, to capture
information from the secondary structure of protein, protein classification is
modeled as graph classification problem. Thirdly, compound-protein interactions
prediction employs graph learning for compound with sequential embedding for
protein. This can be seen as classification task for compound-protein pairs.
This paper proposes three models for protein classification. Firstly, this
paper proposes a Multi-label CAZyme classification model using CNN-LSTM with
Attention mechanism. Secondly, this paper proposes a variational graph
autoencoder based subspace learning model for protein graph classification.
Thirdly, this paper proposes graph isomorphism networks (GIN) and
Attention-based CNN-LSTM for compound-protein interactions prediction, as well
as comparing GIN with graph convolution networks (GCN) and graph attention
networks (GAT) in this task. The proposed models are effective for protein
classification. Source code and data are available at
https://github.com/zshicode/GNN-AttCL-protein. Besides, this repository
collects and collates the benchmark datasets with respect to above problems,
including CAZyme classification, enzyme protein graph classification,
compound-protein interactions prediction, drug-target affinities prediction and
drug-drug interactions prediction. Hence, the usage for evaluation by benchmark
datasets can be more conveniently
Classification of Noncoding RNA Families using Deep Convolutional Neural Networks
In the last decade, the discovery of noncoding RNA (ncRNA) has exploded. Classifying thesencRNA is critical to determining their function. This thesis proposes a new method employing deep convolutional neural networks (CNNs) to classify ncRNA sequences. To this end, this thesis first proposes an efficient approach to convert the RNA sequences into images characterizing their base-pairing probability. As a result, classifying RNA sequences is converted to an image classification problem that can be efficiently solved by available CNN-based classification models. This thesis also considers the folding potential of the ncRNAs in addition to their primary sequence. Based on the proposed approach, a benchmark image classification dataset is generated from the RFAM database of ncRNA sequences. In addition, three classical CNN models and three Siamese network models have been implemented and compared to demonstrate the superior performance and efficiency of the proposed approach. Extensive experimental results show the great potential of using deep learning approaches for RNA classification
- …