125 research outputs found

    SimulFold: Simultaneously Inferring RNA Structures Including Pseudoknots, Alignments, and Trees Using a Bayesian MCMC Framework

    Get PDF
    Computational methods for predicting evolutionarily conserved rather than thermodynamic RNA structures have recently attracted increased interest. These methods are indispensable not only for elucidating the regulatory roles of known RNA transcripts, but also for predicting RNA genes. It has been notoriously difficult to devise them to make the best use of the available data and to predict high-quality RNA structures that may also contain pseudoknots. We introduce a novel theoretical framework for co-estimating an RNA secondary structure including pseudoknots, a multiple sequence alignment, and an evolutionary tree, given several RNA input sequences. We also present an implementation of the framework in a new computer program, called SimulFold, which employs a Bayesian Markov chain Monte Carlo method to sample from the joint posterior distribution of RNA structures, alignments, and trees. We use the new framework to predict RNA structures, and comprehensively evaluate the quality of our predictions by comparing our results to those of several other programs. We also present preliminary data that show SimulFold's potential as an alignment and phylogeny prediction method. SimulFold overcomes many conceptual limitations that current RNA structure prediction methods face, introduces several new theoretical techniques, and generates high-quality predictions of conserved RNA structures that may include pseudoknots. It is thus likely to have a strong impact, both on the field of RNA structure prediction and on a wide range of data analyses

    Predicting RNA pseudoknot folding thermodynamics

    Get PDF
    Based on the experimentally determined atomic coordinates for RNA helices and the self-avoiding walks of the P (phosphate) and C(4) (carbon) atoms in the diamond lattice for the polynucleotide loop conformations, we derive a set of conformational entropy parameters for RNA pseudoknots. Based on the entropy parameters, we develop a folding thermodynamics model that enables us to compute the sequence-specific RNA pseudoknot folding free energy landscape and thermodynamics. The model is validated through extensive experimental tests both for the native structures and for the folding thermodynamics. The model predicts strong sequence-dependent helix-loop competitions in the pseudoknot stability and the resultant conformational switches between different hairpin and pseudoknot structures. For instance, for the pseudoknot domain of human telomerase RNA, a native-like and a misfolded hairpin intermediates are found to coexist on the (equilibrium) folding pathways, and the interplay between the stabilities of these intermediates causes the conformational switch that may underlie a human telomerase disease

    Lynx: A Programmatic SAT Solver for the RNA-folding Problem

    Get PDF
    15th International Conference, Trento, Italy, June 17-20, 2012. ProceedingsThis paper introduces Lynx, an incremental programmatic SAT solver that allows non-expert users to introduce domain-specific code into modern conflict-driven clause-learning (CDCL) SAT solvers, thus enabling users to guide the behavior of the solver. The key idea of Lynx is a callback interface that enables non-expert users to specialize the SAT solver to a class of Boolean instances. The user writes specialized code for a class of Boolean formulas, which is periodically called by Lynx’s search routine in its inner loop through the callback interface. The user-provided code is allowed to examine partial solutions generated by the solver during its search, and to respond by adding CNF clauses back to the solver dynamically and incrementally. Thus, the user-provided code can specialize and influence the solver’s search in a highly targeted fashion. While the power of incremental SAT solvers has been amply demonstrated in the SAT literature and in the context of DPLL(T), it has not been previously made available as a programmatic API that is easy to use for non-expert users. Lynx’s callback interface is a simple yet very effective strategy that addresses this need. We demonstrate the benefits of Lynx through a case-study from computational biology, namely, the RNA secondary structure prediction problem. The constraints that make up this problem fall into two categories: structural constraints, which describe properties of the biological structure of the solution, and energetic constraints, which encode quantitative requirements that the solution must satisfy. We show that by introducing structural constraints on-demand through user provided code we can achieve, in comparison with standard SAT approaches, upto 30x reduction in memory usage and upto 100x reduction in time

    SparseRNAFolD: Sparse RNA Pseudoknot-Free Folding Including Dangles

    Get PDF

    Computational analysis of noncoding RNAs

    Get PDF
    Noncoding RNAs have emerged as important key players in the cell. Understanding their surprisingly diverse range of functions is challenging for experimental and computational biology. Here, we review computational methods to analyze noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources.Austrian Science Fund (Schrodinger Fellowship J2966-B12)German Research Foundation (grant WI 3628/1-1 to SW)National Institutes of Health (U.S.) (NIH award 1RC1CA147187
    • …
    corecore