2 research outputs found

    Trends and challenges in Computational RNA biology

    A report on the Wellcome Trust Conference on Computational RNA Biology, held in Hinxton, UK, on 17–19 October 2016

    Using Machine Learning to Better Predict the Structure of RNA and RNA Containing Complexes

    Determining the structure of RNA in the presence of drug-like molecules is a crucial step in any drug development campaign. Standard experimental approaches are expensive and time-consuming, and current state-of-the-art computational methods are too inaccurate to be useful. In principle, computer docking can be used to predict the 3D structure of RNA-ligand complexes. However, the scoring functions that ship with the available docking programs misclassify native-like poses among a set of decoy poses when ranking RNA-ligand complexes. As such, there is a need for fast, easy, and precise methods for predicting the 3D structure of RNAs. In theory, chemical shifts derived from nuclear magnetic resonance (NMR) spectroscopy contain information about the local chemical environment at each site in a molecule and so can be a rich source of structural information. In this work, the goal is to predict the structure of RNA-ligand complexes using NMR chemical shifts. To that end, we explore how different machine learning algorithms and ring-current models affect the accuracy of chemical shift prediction for standard RNA-ligand complexes. The extremely randomized trees (extra-trees) algorithm and the Pople ring-current model were found to be the most accurate at predicting the chemical shifts of RNA-ligand complexes. Next, we explored the use of chemical shifts to guide the 3D structure prediction of RNA-ligand complexes starting from the RNA sequence. We applied CS-Fold, an in-house method that uses chemical shifts to guide the secondary structure prediction of RNAs. From the best secondary structures predicted by CS-Fold, we generated de novo 3D models of RNAs using the Fragment Assembly of RNA with Full Atom Refinement (FARFAR) approach. We used chemical shifts predicted by LarmorD to refine those 3D structures.
We found that CS-Fold (the CS-guided secondary structure prediction approach) combined with the Rosetta de novo protocol for 3D motif prediction significantly enhanced the recovery rate, to 50% compared with the 20% obtained by the RNAStructure and Rosetta combination. Next, we used rDock to dock the ligand into the 10 best predicted 3D structures of the RNA and filtered the poses based on the chemical shift errors. This study motivated us to build machine learning models, based on a molecular fingerprinting approach, that can distinguish native-like RNA-ligand structures from non-native ones in a decoy set. We then describe RNAPosers, a computational tool that estimates the relative "nativeness" of a set of RNA-ligand poses using machine learning pose classifiers. We trained our pose classifiers on molecular "fingerprints" that were a fusion of atomic fingerprints. These fingerprints encode the local "RNA environment" around ligand atoms. When we ranked the poses by the classification scores from our RNAPosers classifiers, the recovery of native-like poses was significantly better than that obtained from the raw rDock docking scores alone. We also performed leave-one-out validation and found that RNAPosers could recover ~80% of the poses that were within 2.5 Å of the native poses in the 88 RNA-ligand complexes we explored. Likewise, on a validation set of 17 complexes, we could recover poses in ~70% of the complexes. RNAPosers could be used as a tool to help in RNA-ligand pose prediction, and hence we make it available to the academic community via https://github.com/atfrank/RNAPosers.

PhD, Chemistry, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/155127/1/itssahil_1.pd
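
The pose-recovery figures quoted in the abstract can be made concrete with a small sketch. It assumes that a complex counts as "recovered" when a pose within 2.5 Å RMSD of the native pose appears among the top-ranked poses after sorting by classifier score; the abstract does not spell out the exact criterion, so this definition is an assumption. The function names (`rank_poses`, `recovery_rate`) and the toy scores and RMSD values are hypothetical and are not part of RNAPosers or rDock.

```python
from typing import Dict, List, Tuple

# A pose is (classifier_score, rmsd_to_native_in_angstrom).
# All values below are hypothetical toy data, not results from the thesis.
Pose = Tuple[float, float]


def rank_poses(poses: List[Pose]) -> List[Pose]:
    """Sort poses so that the highest classifier score comes first."""
    return sorted(poses, key=lambda pose: pose[0], reverse=True)


def recovery_rate(
    complexes: Dict[str, List[Pose]],
    cutoff: float = 2.5,  # RMSD threshold in angstrom, as quoted in the abstract
    top_n: int = 1,       # assumed: how many top-ranked poses are inspected
) -> float:
    """Fraction of complexes with a native-like pose among the top_n poses."""
    if not complexes:
        raise ValueError("at least one complex is required")
    hits = 0
    for poses in complexes.values():
        top = rank_poses(poses)[:top_n]
        if any(rmsd <= cutoff for _, rmsd in top):
            hits += 1
    return hits / len(complexes)


toy = {
    "complex_a": [(0.91, 1.2), (0.40, 6.8), (0.15, 9.1)],
    "complex_b": [(0.30, 2.0), (0.75, 7.5), (0.10, 4.4)],
}

print(recovery_rate(toy))           # only complex_a is recovered at top 1
print(recovery_rate(toy, top_n=2))  # both complexes are recovered at top 2
```

Replacing the classifier score with a docking score (and flipping the sort order if lower scores are better) would give the baseline against which such a classifier is compared.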