An Efficient Algorithm for Upper Bound on the Partition Function of Nucleic Acids
It has been shown that the minimum free energy structure predicted for RNAs and
RNA-RNA interactions is often incorrect due to inaccuracies in the energy
parameters and inherent limitations of the energy model. In contrast,
ensemble-based quantities such as melting temperature and equilibrium
concentrations can be more reliably predicted. Even structure prediction by
sampling from the ensemble and clustering those structures, as done by Sfold [7],
has proven to be more reliable than minimum free energy structure prediction.
The main obstacle for ensemble-based approaches is the computational complexity
of the partition
function and base pairing probabilities. For instance, the space complexity of
the partition function for RNA-RNA interaction is $O(n^4)$ and the time
complexity is $O(n^6)$, which are prohibitively large [4,12]. Our goal in this
paper is to give a fast algorithm, based on sparse folding, to calculate an
upper bound on the partition function. Our work is based on the recent
algorithm of Hazan and Jaakkola [10]. The space complexity of our algorithm is
the same as that of sparse folding algorithms, and the time complexity of our
algorithm is for single RNA and for RNA-RNA
interaction in practice, in which is the running time of sparse folding
and () is a sequence dependent parameter
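To make the quantity being bounded concrete, the following is a minimal sketch of an exact partition function computation for a single sequence under a toy energy model. It is not the upper-bound algorithm or the sparse folding method described in the abstract. It assumes that every canonical or wobble base pair contributes the same hypothetical energy `pair_energy`, that hairpin loops contain at least `min_loop` unpaired bases, and that structures are non-crossing. The function name and all parameter values are illustrative.

```python
import math

# Canonical and wobble pairs allowed in this toy model.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_energy=-1.0, rt=0.6163, min_loop=3):
    """Sum of Boltzmann weights over all non-crossing structures of seq.

    Toy model (an assumption, not the Turner model): each base pair
    contributes pair_energy (kcal/mol); rt is RT at about 37 degrees C.
    """
    n = len(seq)
    w = math.exp(-pair_energy / rt)  # Boltzmann weight of one base pair
    # z[i][j] is the partition function of the half-open interval seq[i:j];
    # intervals too short to hold a pair keep the value 1 (empty structure).
    z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            # Case 1: the last base (index j - 1) is unpaired.
            total = z[i][j - 1]
            # Case 2: the last base pairs with base k, leaving at least
            # min_loop unpaired bases between them.
            for k in range(i, j - 1 - min_loop):
                if (seq[k], seq[j - 1]) in PAIRS:
                    total += z[i][k] * w * z[k + 1][j - 1]
            z[i][j] = total
    return z[0][n]
```

The triple loop gives cubic time and quadratic space for a single sequence. Setting `pair_energy=0.0` makes every structure weigh 1, so the function then simply counts the admissible structures.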
Exact Learning of RNA Energy Parameters From Structure
We consider the problem of exact learning of parameters of a linear RNA
energy model from secondary structure data. We derive a necessary and
sufficient condition for learnability of the parameters, based on
computing the convex hull of the union of translated Newton polytopes of the
input sequences. The set of learned energy parameters is characterized as the convex
cone generated by the normal vectors to those facets of the resulting polytope
that are incident to the origin. In practice, the sufficient condition may not
be satisfied by the entire training data set; hence, computing a maximal subset
of training data for which the sufficient condition is satisfied is often
desired. We show that this problem is NP-hard in general for a feature space of
arbitrary dimension. Using a randomized greedy algorithm, we select a
subset of the RNA STRAND v2.0 database that satisfies the sufficient condition for
a model that counts A-U, C-G, and G-U base pairs separately. The set of learned
energy parameters includes the experimentally measured energies of A-U, C-G, and
G-U pairs; hence, our parameter set is in agreement with the Turner parameters.
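As an illustration of what a linear energy model with separate A-U, C-G, and G-U pair counts looks like, here is a minimal sketch. It is not the learning procedure from the abstract. It only maps a sequence and a dot-bracket structure to a three-dimensional feature vector and evaluates the energy as an inner product with a parameter vector. The function names and the parameter values in the usage note are hypothetical and are not Turner parameters.

```python
def pair_counts(seq, dot_bracket):
    """Return (#A-U, #C-G, #G-U) pairs of a dot-bracket structure on seq."""
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have equal length")
    counts = {"AU": 0, "CG": 0, "GU": 0}
    stack = []  # indices of currently unmatched opening brackets
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            i = stack.pop()
            # Sort the two bases so that U-A and A-U map to the same key.
            key = "".join(sorted(seq[i] + seq[j]))
            if key not in counts:
                raise ValueError(f"unsupported pair {seq[i]}-{seq[j]}")
            counts[key] += 1
    if stack:
        raise ValueError("unbalanced structure")
    return (counts["AU"], counts["CG"], counts["GU"])


def linear_energy(features, params):
    """Energy of a structure as the inner product of features and params."""
    return sum(f * p for f, p in zip(features, params))
```

For example, `pair_counts("GGGAAAUCC", "(((...)))")` yields two C-G pairs and one G-U pair, and `linear_energy` with the made-up parameter vector `(-1.0, -3.0, -0.5)` then scores that structure by weighting each count.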
Towards 3D structure prediction of large RNA molecules: an integer programming framework to insert local 3D motifs in RNA secondary structure
Motivation: Predicting RNA 3D structure from sequence alone is a milestone for RNA function analysis and prediction. In recent years, many methods have addressed this challenge, ranging from cycle decomposition and fragment assembly to molecular dynamics simulations. However, their predictions remain fragile and limited to small RNAs. To expand the range and accuracy of these techniques, we need algorithms that can use all the available structural information. In particular, the energetic contribution of secondary structure interactions is now well documented, but the quantification of non-canonical interactions (those shaping the tertiary structure) is poorly understood. Nonetheless, even though a complete RNA tertiary structure energy model is currently unavailable, we now have catalogues of local 3D structural motifs, including non-canonical base pairings. A practical objective is thus to develop techniques that use this knowledge to build robust RNA tertiary structure predictors.
Computational analysis of noncoding RNAs
Noncoding RNAs have emerged as key players in the cell. Understanding their surprisingly diverse range of functions is challenging for experimental and computational biology. Here, we review computational methods to analyze noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources. Funding: Austrian Science Fund (Schrodinger Fellowship J2966-B12); German Research Foundation (grant WI 3628/1-1 to SW); National Institutes of Health (U.S.) (NIH award 1RC1CA147187)
ViennaRNA Package 2.0
Background: Secondary structure forms an important intermediate level of description of nucleic acids that encapsulates the dominating part of the folding energy, is often well conserved in evolution, and is routinely used as a basis to explain experimental findings. Based on carefully measured thermodynamic parameters, exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties. Results: The ViennaRNA Package has been a widely used compilation of RNA secondary structure related computer programs for nearly two decades. Major changes in the structure of the standard energy model, the Turner 2004 parameters, the pervasive use of multi-core CPUs, and an increasing number of algorithmic variants prompted a major technical overhaul of both the underlying RNAlib and the interactive user programs. New features include an expanded repertoire of tools to assess RNA-RNA interactions and restricted ensembles of structures, additional output information such as centroid structures and maximum expected accuracy structures derived from base pairing probabilities, or z-scores for locally stable secondary structures, and support for input in fasta format. Updates were implemented without compromising the computational efficiency of the core algorithms and ensuring compatibility with earlier versions. Conclusions: The ViennaRNA Package 2.0, supporting concurrent computations via OpenMP, can be downloaded from http://www.tbi.univie.ac.at/RNA.
14 - Estructura, plegamiento y evolución de RNA
Chapter 14 of the book "Bioinformática con Ñ", a project aiming to provide specialized educational bibliography on bioinformatics for Spanish speakers. The result consists of more than 500 pages covering the following matters: biomedical databases, sequence analysis, phylogeny and evolution, structural biology (including diverse topics such as docking, virtual screening, and molecular dynamics), statistics and R, systems biology, programming skills, data mining, parallel computation, bibliography management, and science article writing.
Une signature du polymorphisme structural d'acides ribonucléiques non-codants permettant de comparer leurs niveaux d'activités biochimiques
Recent experimental evidence indicates that RNAs change structure over time, sometimes very rapidly, and that these changes are required for their biochemical activity. RNA structure is thus dynamic. The same evidence indicates that the key structures involved are predicted by the secondary structure prediction software MC-Fold.
By comparing MC-Fold structure predictions, we observed a clear link between near-optimal structures (in terms of the stability predicted by the software) and the variations in biochemical activity that follow point changes in the sequence.
We compared RNA sequences from the point of view of their structural dynamics, computing a signature from the output of MC-Fold, in order to investigate how similar their biochemical activities are. This required a considerable acceleration of MC-Fold; the algorithmic approach is described in chapter 1. In chapter 2, we classify the impact of slight sequence variations in microRNAs on their natural function, explaining point mutations that disrupt biochemical activity in terms of changes in RNA dynamics. Finally, in chapter 3, we identify windows in long RNAs whose dynamic structures potentially play significant roles in autism spectrum disorders and, separately, in egg polarisation in certain frogs (Xenopus spp.).
A folding algorithm for extended RNA secondary structures
Motivation: RNA secondary structure contains many non-canonical base pairs of different pair families. Successful prediction of these structural features leads to improved secondary structures with applications in tertiary structure prediction and simultaneous folding and alignment.