98,402 research outputs found

Using transfer learning and loss function adaptation for RNA secondary structure prediction

Author: FALDANI GIOVANNI
Publication date: 24/10/2023

Predicting RNA secondary structure is a challenging problem that involves various fields of computer science. Accurate solutions are useful in medicine, for vaccine development and the design of stable mRNA molecules, and in biology, for distinguishing the functions of different RNA molecules according to their shape. The objective of this project is to study an emerging machine learning-based approach to RNA secondary structure prediction that integrates deep learning techniques such as transfer learning and convolutional neural networks, aided by adaptations made for the specific problem at hand, such as the data representation and the loss function. The broader aim is to provide a new, robust machine learning-based approach to RNA secondary structure prediction through the integration and improvement of emerging deep learning techniques.

Padua Thesis and Dissertation Archive

Generative Tertiary Structure-based RNA Design

Author: Gao Zhangyang
Li Stan Z.
Tan Cheng
Publication date: 25/01/2023

Learning from 3D biological macromolecules with artificial intelligence technologies is an emerging area. Computational protein design, known as the inverse of protein structure prediction, aims to generate protein sequences that will fold into a defined structure. Analogous to protein design, RNA design is also an important topic in synthetic biology; it aims to generate RNA sequences for given structures. However, existing RNA design methods mainly focus on the secondary structure, ignoring the informative tertiary structure, which is commonly used in protein design. To explore the complex coupling between RNA sequence and 3D structure, we introduce an RNA tertiary structure modeling method to efficiently capture useful information from the 3D structure of RNA. For a fair comparison, we collect abundant RNA data and split the data according to tertiary structures. With the standard dataset, we conduct a benchmark by employing structure-based protein design approaches with our RNA tertiary structure modeling method. We believe our work will stimulate the future development of tertiary structure-based RNA design and bridge the gap between RNA 3D structures and sequences.

arXiv.org e-Print Archive

Identification of RNA Binding Proteins and RNA Binding Residues Using Effective Machine Learning Techniques

Author: Khanal Reecha
Publication venue: ScholarWorks@UNO
Publication date: 01/04/2019

Identification and annotation of RNA Binding Proteins (RBPs) and RNA-binding residues from sequence information alone is one of the most challenging problems in computational biology. RBPs play crucial roles in several fundamental biological functions, including transcriptional regulation of RNAs, RNA metabolism, and splicing. Existing experimental techniques are time-consuming and costly. Thus, efficient computational identification of RBPs directly from the sequence can be useful to annotate RBPs and assist experimental design. Here, we introduce AIRBP, a computational sequence-based method that uses features extracted from evolutionary information, physicochemical properties, and disordered properties to train a model built with stacking, an advanced machine learning technique, for effective prediction of RBPs. It makes use of efficient machine learning algorithms such as Support Vector Machine, Logistic Regression, K-Nearest Neighbor, and XGBoost (Extreme Gradient Boosting). In this research work, we also propose a second predictor for efficient annotation of RNA-binding residues. This residue predictor also uses stacking and evolutionary algorithms, and it likewise draws on various evolutionary, physicochemical, and disordered properties to train a robust model. This thesis presents a possible solution to the RBP and RNA-binding residue prediction problem through two independent predictors, both of which outperform existing state-of-the-art approaches.

University of New Orleans

Deep learning models for predicting RNA degradation via dual crowdsourcing

Author: Amer Karim
Chiu King Yuen
Das Rhiju
Demkin Maggie
Fares Mohamed
Fujikawa Kazuki
Gao Jiayang
He Shujun
Ishi Keiichiro
Ito Takuya
Kim Do Soon
Kladwang Wipapat
Lee Youhan
Mao Hanfei
Nicol John J.
Noumi Taiga
Onodera Kazuki
Reade Walter
Romano Jonathan
Steenwinckel Bram
Tinti Michele
Tunguz Bojan
Vandewiele Gilles
Watkins Andrew M.
Wayment-Steele Hannah K.
Wellington-Oguri Roger
Öztürk Emin
Öztürk Fatih
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

Medicines based on messenger RNA (mRNA) hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition (‘Stanford OpenVaccine’) on Kaggle, involving single-nucleotide resolution measurements on 6,043 diverse 102–130-nucleotide RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504–1,588 nucleotides) with improved accuracy compared with previously published models. These results indicate that such models can represent in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for dataset creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales
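
As an illustration of the accuracy figure quoted in the abstract above, the following minimal sketch computes the share of per-nucleotide predictions that fall within experimental error of the measurement. The function name, the data layout (parallel lists of predictions, measurements, and error estimates), and the rule that "within error" means |prediction - measurement| <= error are all assumptions made for illustration; the abstract does not state how the competition computed this figure.

```python
from typing import Sequence


def fraction_within_error(
    predicted: Sequence[float],
    measured: Sequence[float],
    error: Sequence[float],
) -> float:
    """Share of positions where |predicted - measured| <= error.

    The acceptance criterion is an assumption, not taken from the study.
    """
    if not (len(predicted) == len(measured) == len(error)):
        raise ValueError("all inputs must have the same length")
    if not predicted:
        raise ValueError("inputs must not be empty")
    hits = sum(
        1 for p, m, e in zip(predicted, measured, error) if abs(p - m) <= e
    )
    return hits / len(predicted)


# Made-up values for four nucleotides; these are not data from the study.
example = fraction_within_error(
    predicted=[0.50, 0.25, 1.00, 0.75],
    measured=[0.625, 0.75, 1.125, 0.25],
    error=[0.25, 0.25, 0.25, 0.25],
)
print(example)
```

With these made-up inputs, two of the four predictions lie within the assumed error band, so the function returns 0.5.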

PubMed Central

University of Dundee Online Publications