9 research outputs found

    An Efficient Algorithm for Upper Bound on the Partition Function of Nucleic Acids

    It has been shown that minimum free energy structure for RNAs and RNA-RNA interaction is often incorrect due to inaccuracies in the energy parameters and inherent limitations of the energy model. In contrast, ensemble based quantities such as melting temperature and equilibrium concentrations can be more reliably predicted. Even structure prediction by sampling from the ensemble and clustering those structures by Sfold [7] has proven to be more reliable than minimum free energy structure prediction. The main obstacle for ensemble based approaches is the computational complexity of the partition function and base pairing probabilities. For instance, the space complexity of the partition function for RNA-RNA interaction is O(n^4) and the time complexity is O(n^6), which are prohibitively large [4,12]. Our goal in this paper is to give a fast algorithm, based on sparse folding, to calculate an upper bound on the partition function. Our work is based on the recent algorithm of Hazan and Jaakkola [10]. The space complexity of our algorithm is the same as that of sparse folding algorithms, and the time complexity of our algorithm is O(MFE(n) ℓ) for single RNA and O(MFE(m, n) ℓ) for RNA-RNA interaction in practice, in which MFE is the running time of sparse folding and ℓ ≤ n (ℓ ≤ n + m) is a sequence dependent parameter.
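    For readers unfamiliar with the quantity being bounded, the following is a minimal sketch of an exact partition function for a single sequence under a deliberately simplified model. It is not the upper-bound algorithm of the paper and does not use sparse folding. The assumptions are illustrative: every allowed base pair contributes the same energy (`pair_energy`), `rt` stands for the thermal energy RT in the same units, and `min_loop` is the minimum number of unpaired bases enclosed by a pair. The function name and parameters are hypothetical.

```python
import math

# Canonical and wobble pairs allowed in this toy model (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_energy=-1.0, rt=0.6163, min_loop=3):
    """Sum of Boltzmann weights over all nested secondary structures.

    Toy energy model: each base pair contributes `pair_energy`;
    unpaired bases and loops contribute nothing.
    Runs in O(n^3) time and O(n^2) space.
    """
    n = len(seq)
    boltz = math.exp(-pair_energy / rt)  # Boltzmann weight of one pair
    # z[i][j] is the partition function of seq[i:j] (half-open interval).
    # Empty and short intervals admit only the unpaired structure, weight 1.
    z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            last = j - 1  # index of the last base in the interval
            total = z[i][last]  # case 1: the last base is unpaired
            # case 2: the last base pairs with k, enclosing > min_loop - 1 bases
            for k in range(i, last - min_loop):
                if (seq[k], seq[last]) in PAIRS:
                    total += z[i][k] * z[k + 1][last] * boltz
            z[i][j] = total
    return z[0][n]
```

    With `pair_energy=0` every structure has weight 1, so the function simply counts the nested structures of the sequence. The cubic running time here applies only to a single strand in this toy model; the O(n^4) space and O(n^6) time quoted in the abstract refer to the much larger RNA-RNA interaction recursion.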

    Exact Learning of RNA Energy Parameters From Structure

    We consider the problem of exact learning of parameters of a linear RNA energy model from secondary structure data. A necessary and sufficient condition for learnability of parameters is derived, which is based on computing the convex hull of the union of translated Newton polytopes of input sequences. The set of learned energy parameters is characterized as the convex cone generated by the normal vectors to those facets of the resulting polytope that are incident to the origin. In practice, the sufficient condition may not be satisfied by the entire training data set; hence, computing a maximal subset of training data for which the sufficient condition is satisfied is often desired. We show that this problem is NP-hard in general for a feature space of arbitrary dimension. Using a randomized greedy algorithm, we select a subset of the RNA STRAND v2.0 database that satisfies the sufficient condition for a model that counts A-U, C-G, and G-U base pairs separately. The set of learned energy parameters includes the experimentally measured energies of A-U, C-G, and G-U pairs; hence, our parameter set is in agreement with the Turner parameters.
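    To make the notion of a linear energy model concrete, here is a minimal sketch of the base pair counting features mentioned in the abstract: a structure is mapped to the counts of its A-U, C-G, and G-U pairs, and its energy is the dot product of those counts with a weight vector. The function names, the dot-bracket input format, and the weights in the usage note are assumptions for illustration; this is not the learning procedure of the paper, and the weights are not measured energies.

```python
def pair_counts(seq, structure):
    """Count A-U, C-G and G-U pairs in a dot-bracket secondary structure."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    counts = {"AU": 0, "CG": 0, "GU": 0}
    stack = []  # positions of currently unmatched opening brackets
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            i = stack.pop()
            # Sort the two bases so that U-A and A-U map to the same key.
            key = "".join(sorted((seq[i], seq[j])))
            if key not in counts:
                raise ValueError(f"unsupported pair {seq[i]}-{seq[j]}")
            counts[key] += 1
    if stack:
        raise ValueError("unbalanced structure")
    return counts


def linear_energy(seq, structure, weights):
    """Energy as a dot product of feature counts and per-feature weights."""
    counts = pair_counts(seq, structure)
    return sum(weights[name] * counts[name] for name in counts)
```

    For example, `linear_energy("GGGAAAUCC", "(((...)))", {"AU": -2.0, "CG": -3.0, "GU": -1.0})` evaluates two C-G pairs and one G-U pair under the made-up weights. Because the energy is linear in the counts, each structure corresponds to a point in a three-dimensional feature space, which is the setting in which the polytope condition of the abstract is stated.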

    Towards 3D structure prediction of large RNA molecules: an integer programming framework to insert local 3D motifs in RNA secondary structure

    Motivation: Predicting the 3D structure of an RNA from its sequence alone is a milestone for RNA function analysis and prediction. In recent years, many methods have addressed this challenge, ranging from cycle decomposition and fragment assembly to molecular dynamics simulations. However, their predictions remain fragile and limited to small RNAs. To expand the range and accuracy of these techniques, we need to develop algorithms that make use of all the structural information available. In particular, the energetic contribution of secondary structure interactions is now well documented, but the quantification of non-canonical interactions (those shaping the tertiary structure) is poorly understood. Nonetheless, even if a complete RNA tertiary structure energy model is currently unavailable, we now have catalogues of local 3D structural motifs including non-canonical base pairings. A practical objective is thus to develop techniques enabling us to use this knowledge for robust RNA tertiary structure predictors.

    Computational analysis of noncoding RNAs

    Noncoding RNAs have emerged as key players in the cell. Understanding their surprisingly diverse range of functions is challenging for experimental and computational biology. Here, we review computational methods to analyze noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources. Funding: Austrian Science Fund (Schrodinger Fellowship J2966-B12); German Research Foundation (grant WI 3628/1-1 to SW); National Institutes of Health (U.S.) (NIH award 1RC1CA147187

    ViennaRNA Package 2.0

    Background: Secondary structure forms an important intermediate level of description of nucleic acids that encapsulates the dominating part of the folding energy, is often well conserved in evolution, and is routinely used as a basis to explain experimental findings. Based on carefully measured thermodynamic parameters, exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties. Results: The ViennaRNA Package has been a widely used compilation of RNA secondary structure related computer programs for nearly two decades. Major changes in the structure of the standard energy model, the Turner 2004 parameters, the pervasive use of multi-core CPUs, and an increasing number of algorithmic variants prompted a major technical overhaul of both the underlying RNAlib and the interactive user programs. New features include an expanded repertoire of tools to assess RNA-RNA interactions and restricted ensembles of structures, additional output information such as centroid structures and maximum expected accuracy structures derived from base pairing probabilities, or z-scores for locally stable secondary structures, and support for input in fasta format. Updates were implemented without compromising the computational efficiency of the core algorithms and ensuring compatibility with earlier versions. Conclusions: The ViennaRNA Package 2.0, supporting concurrent computations via OpenMP, can be downloaded from http://www.tbi.univie.ac.at/RNA.

    14 - Estructura, plegamiento y evolución de RNA

    Chapter 14 of the book "Bioinformática con Ñ", a project aiming to provide specialized educational bibliography on bioinformatics for Spanish speakers. The result consists of more than 500 pages covering the following topics: biomedical databases, sequence analysis, phylogeny and evolution, structural biology (including diverse topics such as docking, virtual screening, and molecular dynamics), statistics and R, systems biology, programming skills, data mining, parallel computation, bibliography management, and scientific article writing.

    Une signature du polymorphisme structural d'acides ribonucléiques non-codants permettant de comparer leurs niveaux d'activités biochimiques

    Recent experimental evidence indicates that RNA structures change over time, sometimes very rapidly, and that these changes are both required for biochemical activity and captured by the secondary structure prediction software MC-Fold. RNA structure is thus dynamic. Comparing MC-Fold structure predictions, we found a clear link between near-optimal structures (in terms of the stability predicted by the software) and the variations in biochemical activity that follow point changes in the sequence. We compared RNA sequences from the point of view of their structural dynamics, computing a signature from the output of MC-Fold, so as to investigate how similar their biochemical activities are. This required a considerable acceleration of MC-Fold; the algorithmic approach to this acceleration is described in chapter 1. In chapter 2, point mutations that disrupt the biochemical activity of microRNAs are explained in terms of changes in RNA dynamics. Finally, in chapter 3 we identify dynamic structure windows in long RNAs with potentially significant roles in autism spectrum disorders and, separately, in egg polarisation in Xenopus spp. (species of frogs).

    A folding algorithm for extended RNA secondary structures

    Motivation: RNA secondary structure contains many non-canonical base pairs of different pair families. Successful prediction of these structural features leads to improved secondary structures with applications in tertiary structure prediction and simultaneous folding and alignment