Expected degree for RNA secondary structure networks

Author: Clote Peter
Publication date: 01/11/2014

Consider the network of all secondary structures of a given RNA sequence, where nodes are connected when the corresponding structures have base pair distance one. The expected degree of the network is the average number of neighbors, where the average may be computed with respect to either the uniform or the Boltzmann probability. Here we describe the first algorithm, RNAexpNumNbors, that can compute the expected number of neighbors, or expected network degree, of an input sequence. For RNA sequences from the Rfam database, the expected degree is significantly less than the degree of the CMFE structure, defined to have minimum free energy over all structures consistent with the Rfam consensus structure. The expected degree of structural RNAs, such as purine riboswitches, paradoxically appears to be smaller than that of random RNA, yet the difference between the degree of the MFE structure and the expected degree is larger than that of random RNA. Expected degree does not seem to correlate with standard structural diversity measures of RNA, such as positional entropy, ensemble defect, etc. The program RNAexpNumNbors is written in C, runs in cubic time and quadratic space, and is publicly available at http://bioinformatics.bc.edu/clotelab/RNAexpNumNbors. Comment: 25 pages, 5 figures, 5 table
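To make the definition of expected degree concrete, the following is a minimal brute-force sketch, not the cubic-time RNAexpNumNbors algorithm. It assumes Watson-Crick and G-U wobble pairs, a minimum hairpin loop size `theta`, and, for the Boltzmann case, a toy energy of -1 per base pair with a hypothetical temperature parameter `rt`; all function names are illustrative. It enumerates every structure, so it is only usable on short sequences.

```python
import math
from functools import lru_cache

# Assumed pairing rules: Watson-Crick plus G-U wobble pairs.
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def structures(seq, theta=3):
    """Enumerate all secondary structures of seq as frozensets of pairs (i, j)."""
    @lru_cache(maxsize=None)
    def rec(i, j):
        if j - i <= theta:               # interval too short to hold any pair
            return (frozenset(),)
        out = list(rec(i + 1, j))        # case: position i is unpaired
        for k in range(i + theta + 1, j + 1):
            if seq[i] + seq[k] in PAIRS:  # case: i pairs with k
                for inner in rec(i + 1, k - 1):
                    for outer in rec(k + 1, j):
                        out.append(inner | outer | {(i, k)})
        return tuple(out)
    return rec(0, len(seq) - 1)


def degree(seq, s, theta=3):
    """Number of structures at base pair distance one from s."""
    paired = {x for p in s for x in p}
    additions = 0
    for i in range(len(seq)):
        for j in range(i + theta + 1, len(seq)):
            if i in paired or j in paired or seq[i] + seq[j] not in PAIRS:
                continue
            # Reject pairs that would cross an existing pair (pseudoknot).
            if any(k < i < l < j or i < k < j < l for k, l in s):
                continue
            additions += 1
    return len(s) + additions            # removals plus additions


def expected_degree(seq, theta=3, rt=None):
    """Uniform average if rt is None, else Boltzmann with energy -1 per pair."""
    all_s = structures(seq, theta)
    weights = [1.0 if rt is None else math.exp(len(s) / rt) for s in all_s]
    total = sum(weights)
    return sum(w * degree(seq, s, theta) for w, s in zip(weights, all_s)) / total


if __name__ == "__main__":
    print(expected_degree("GGGAAAUCC"))           # uniform probability
    print(expected_degree("GGGAAAUCC", rt=0.6))   # toy Boltzmann probability
```

Every neighbor of a structure is obtained by removing one of its base pairs or by adding one compatible pair, which is why `degree` is the sum of those two counts.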

CiteSeerX

Combinatorial RNA Design Designability and Structure-Approximating Algorithm in Watson-Crick and Nussinov-Jacobson Energy Models

Author: Haleš Jozef
Héliou Alice
Maňuch Ján
Ponty Yann
Stacho Ladislav
Publication date: 01/01/2016

We consider the Combinatorial RNA Design problem, a minimal instance of RNA design where one must produce an RNA sequence that adopts a given secondary structure as its minimal free-energy structure. We consider two free-energy models where the contributions of base pairs are additive and independent: the purely combinatorial Watson-Crick model, which only allows equally-contributing A-U and C-G base pairs, and the real-valued Nussinov-Jacobson model, which associates arbitrary energies to A-U, C-G and G-U base pairs. We first provide a complete characterization of designable structures using restricted alphabets and, in the four-letter alphabet, provide a complete characterization for designable structures without unpaired bases. When unpaired bases are allowed, we characterize extensive classes of (non-)designable structures, and prove the closure of the set of designable structures under the stutter operation. Membership of a given structure to any of the classes can be tested in $\Theta(n)$ time, including the generation of a solution sequence for positive instances. Finally, we consider a structure-approximating relaxation of the design, and provide a $\Theta(n)$ algorithm which, given a structure S that avoids two trivially non-designable motifs, transforms S into a designable structure constructively by adding at most one base-pair to each of its stems. Comment: To appea
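As an illustration of what it means for a sequence to be a design in the Watson-Crick model, here is a hedged sketch of a design checker. It assumes that "design" means the target is the unique structure maximizing the number of A-U and C-G pairs, and it exposes a hypothetical minimum loop size `theta` (default 0) as a parameter; neither assumption is stated in the abstract. It uses a cubic-time counting recursion, not the linear-time membership tests mentioned above.

```python
from functools import lru_cache

# Watson-Crick model: only A-U and C-G pairs, each contributing equally.
WC = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def parse_dot_bracket(structure):
    """Return the set of base pairs (i, j) encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character: " + ch)
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def optimum(seq, theta=0):
    """Return (max number of pairs, number of structures reaching it)."""
    @lru_cache(maxsize=None)
    def f(i, j):
        if i >= j:
            return (0, 1)
        best, count = f(i + 1, j)              # position i unpaired
        for k in range(i + theta + 1, j + 1):  # position i paired with k
            if (seq[i], seq[k]) in WC:
                b1, c1 = f(i + 1, k - 1)
                b2, c2 = f(k + 1, j)
                cand = 1 + b1 + b2
                if cand > best:
                    best, count = cand, c1 * c2
                elif cand == best:
                    count += c1 * c2
        return best, count
    return f(0, len(seq) - 1)


def is_design(seq, structure, theta=0):
    """True if structure is the unique maximum-pair structure of seq."""
    if len(seq) != len(structure):
        return False
    pairs = parse_dot_bracket(structure)
    if any((seq[i], seq[j]) not in WC or j - i <= theta for i, j in pairs):
        return False
    best, count = optimum(seq, theta)
    return len(pairs) == best and count == 1


if __name__ == "__main__":
    print(is_design("GGCC", "(())"))   # one optimal structure
    print(is_design("GCGC", "()()"))   # two optimal structures compete
```

The recursion decomposes each structure uniquely by the partner of its leftmost position, so the count of optimal structures is exact rather than an overcount.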

Public Library of Science (PLOS)

Efficient Algorithms for Probing the RNA Mutation Landscape

Author: A Coventry
A Omer
A Serganov
AO Harmanci
B Baker
B Knudsen
Bonnie Berger
C Reidys
C Thurner
Consortium ENCODE Project
D Barash
D Mathews
DH Mathews
E Rivas
I Hofacker
I Miklos
IL Hofacker
IM Meyer
J Waldispuhl
JS McCaskill
JS Pedersen
JS Weinger
Jérôme Waldispühl
M Yanagi
M Yang
M Zuker
MC Cowperthwaite
MT Cheah
NM Cuceanu
P Clote
P Schuster
Peter Clote
PP Gardner
R Nussinov
RA Dimitrov
RD Dowell
S Brown
S Griffiths-Jones
S You
SH Bernhart
Srinivas Devadas
T Kulinski
T Xia
Uwe Ohler
V Ambros
W Fontana
W Grüner
W Shu
Y Ding
Y Ponty
Publication venue: Public Library of Science
Publication date: 08/08/2008

The diversity and importance of the role played by RNAs in the regulation and development of the cell are now well-known and well-documented. This broad range of functions is achieved through specific structures that have been (presumably) optimized through evolution. State-of-the-art methods, such as McCaskill's algorithm, use a statistical mechanics framework based on the computation of the partition function over the canonical ensemble of all possible secondary structures on a given sequence. Although secondary structure predictions from thermodynamics-based algorithms are not as accurate as methods employing comparative genomics, the former methods are the only available tools to investigate novel RNAs, such as the many RNAs of unknown function recently reported by the ENCODE consortium. In this paper, we generalize the McCaskill partition function algorithm to sum over the grand canonical ensemble of all secondary structures of all mutants of the given sequence. Specifically, our new program, RNAmutants, simultaneously computes for each integer k the minimum free energy structure MFE(k) and the partition function Z(k) over all secondary structures of all k-point mutants, even allowing the user to specify certain positions required not to mutate and certain positions required to base-pair or remain unpaired. This technically important extension allows us to study the resilience of an RNA molecule to pointwise mutations. By computing the mutation profile of a sequence, a novel graphical representation of the mutational tendency of nucleotide positions, we analyze the deleterious nature of mutating specific nucleotide positions or groups of positions. We have successfully applied RNAmutants to investigate deleterious mutations (mutations that radically modify the secondary structure) in the Hepatitis C virus cis-acting replication element and to evaluate the evolutionary pressure applied on different regions of the HIV trans-activation response element. 
In particular, we show qualitative agreement between published Hepatitis C and HIV experimental mutagenesis studies and our analysis of deleterious mutations using RNAmutants. Our work also predicts other deleterious mutations, which could be verified experimentally. Finally, we provide evidence that the 3′ UTR of the GB RNA virus C has been optimized to preserve evolutionarily conserved stem regions from a deleterious effect of pointwise mutations. We hope that there will be long-term potential applications of RNAmutants in de novo RNA design and drug design against RNA viruses. This work also suggests potential applications for large-scale exploration of the RNA sequence-structure network. Binary distributions are available at http://RNAmutants.csail.mit.edu/
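The quantity Z(k) described above can be illustrated with a brute-force sketch. This is not the RNAmutants algorithm: it simply enumerates every k-point mutant and sums a McCaskill-style partition function for each, under a toy energy model that is an assumption here (-1 per base pair, Watson-Crick and G-U pairs, minimum loop size `theta`, hypothetical temperature constant `rt`). The cost grows exponentially with k, so it is only meant for very short sequences.

```python
import math
from functools import lru_cache
from itertools import combinations, product

# Assumed pairing rules: Watson-Crick plus G-U wobble pairs.
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def partition_function(seq, theta=3, pair_energy=-1.0, rt=0.6):
    """Sum of Boltzmann weights over all secondary structures of seq."""
    w = math.exp(-pair_energy / rt)          # weight contributed by one pair

    @lru_cache(maxsize=None)
    def z(i, j):
        if j - i <= theta:                   # only the empty structure fits
            return 1.0
        total = z(i + 1, j)                  # position i unpaired
        for k in range(i + theta + 1, j + 1):
            if seq[i] + seq[k] in PAIRS:     # position i paired with k
                total += w * z(i + 1, k - 1) * z(k + 1, j)
        return total

    return z(0, len(seq) - 1)


def k_point_mutants(seq, k):
    """Yield every sequence at Hamming distance exactly k from seq."""
    for positions in combinations(range(len(seq)), k):
        choices = [[b for b in "ACGU" if b != seq[p]] for p in positions]
        for letters in product(*choices):
            mutant = list(seq)
            for p, b in zip(positions, letters):
                mutant[p] = b
            yield "".join(mutant)


def mutant_partition_functions(seq, max_k, **model):
    """Return [Z(0), Z(1), ..., Z(max_k)] by exhaustive enumeration."""
    return [
        sum(partition_function(m, **model) for m in k_point_mutants(seq, k))
        for k in range(max_k + 1)
    ]


if __name__ == "__main__":
    print(mutant_partition_functions("GGGAAAUCC", 2))
```

Setting `pair_energy=0.0` turns each partition function into a plain count of structures, which is a convenient sanity check on small inputs.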

Directory of Open Access Journals

Efficient approximations of RNA kinetics landscape using non-redundant sampling

Author: Akutsu
Andronescu
Badelt
Baumstark
Cech
Cruz
Danilova
Denise
Ding
du Boisberranger
Flajolet
Flamm
Hélène Touzet
Isambert
Juraj Michálik
Kucharik
Kushwaha
Li
Lorenz
Mathews
Maňuch
McCaskill
Miao
Morgan
Nawrocki
Nussinov
Saffarian
Schultes
Senter
Sharova
Sheikh
Smola
Tinoco
Turner
Waldispühl
Watters
Wilkinson
Wolfinger
Wuchty
Xayaphoummine
Yann Ponty
Publication venue: Oxford University Press (OUP)
Publication date: 21/07/2017

Motivation: Kinetics is key to understanding many phenomena involving RNAs, such as co-transcriptional folding and riboswitches. Exact out-of-equilibrium studies induce extreme computational demands, leading state-of-the-art methods to rely on approximated kinetics landscapes, obtained using sampling strategies that strive to generate the key landmarks of the landscape topology. However, such methods are impeded by a high level of redundancy within sampled sets. Such redundancy is uninformative and obfuscates important intermediate states, leading to an incomplete vision of RNA dynamics. Results: We introduce RNANR, a new set of algorithms for the exploration of RNA kinetics landscapes at the secondary structure level. RNANR considers locally optimal structures, a reduced set of RNA conformations, in order to focus its sampling on basins in the kinetic landscape. Along with an exhaustive enumeration, RNANR implements a novel non-redundant stochastic sampling, and offers a rich array of structural parameters. Our tests on both real and random RNAs reveal that RNANR generates more unique structures in a given time than its competitors, and allows a deeper exploration of kinetics landscapes. Availability: RNANR is freely available at https://project.inria.fr/rnalands/rnan

HAL Descartes

Hal-Diderot

Thermodynamics of RNA structures by Wang–Landau sampling

Author: Abrahams
Bekaert
Bernhart
Bradley
Böck
Cheah
Chen
Clote
Danilova
Dimitrov
Dirks
Eddy
F. Lou
Flamm
Griffiths-Jones
Hofacker
Kirkpatrick
Knudsen
Kou
Lim
Lyngsø
Mandal
Markham
Metzler
Nussinov
Omer
Ortiz
P. Clote
Reeder
Reinisch
Ren
Rivas
Tabaska
Tucker
van Batenburg
Wang
Weinger
Wuchty
Xayaphoummine
Zhang
Zhao
Zuker
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Thermodynamics-based dynamic programming RNA secondary structure algorithms have been of immense importance in molecular biology, where applications range from the detection of novel selenoproteins using expressed sequence tag (EST) data, to the determination of microRNA genes and their targets. Dynamic programming algorithms have been developed to compute the minimum free energy secondary structure and partition function of a given RNA sequence, the minimum free energy and partition function for the hybridization of two RNA molecules, etc. However, the applicability of dynamic programming methods depends on disallowing certain types of interactions (pseudoknots, zig-zags, etc.), as their inclusion renders structure prediction a nondeterministic polynomial time (NP)-complete problem. Nevertheless, such interactions have been observed in X-ray structures

HAL-CentraleSupelec

HAL-Rennes 1

TT2NE: A novel algorithm to predict RNA secondary structures with pseudoknots

Author: Bailor
Bellaousov
Bon
Dam
Dirks
Doshi
Elliot
Friedman
Henri Orland
Kuo
Lyngso
Mathews
McCaskill
Metzler
Michaël Bon
Nussinov
Orland
Pillsbury
Reeder
Ren
Rivas
Ruan
Shen
't Hooft
Tinoco
Vernizzi
Zhao
Zuker
Publication date: 21/10/2010

We present TT2NE, a new algorithm to predict RNA secondary structures with pseudoknots. The method is based on a classification of RNA structures according to their topological genus. TT2NE is guaranteed to find the minimum free energy structure irrespective of pseudoknot topology. This unique proficiency is obtained at the expense of the maximum length of sequence that can be treated, but comparison with state-of-the-art algorithms shows that TT2NE is a very powerful tool within its limits. Analysis of TT2NE's wrong predictions sheds light on the need to study how steric constraints limit the range of pseudoknotted structures that can be formed from a given sequence. An implementation of TT2NE on a public server can be found at http://ipht.cea.fr/rna/tt2ne.php

HAL-CEA

Controlled non uniform random generation of decomposable structures

Author: A. Denise
Berghen
Bertoni
Bostan
Brlek
Denise
Dershowitz
Drmota
Duchon
Dutour
Faugère
Flajolet
Fontana
Goldwurm
Greene
Hofacker
Jin
Lipshitz
M. Termier
Mathews
Nebel
Nicodème
Nijenhuis
Ponty
Salvy
Schönhage
van der Hoeven
Vauchaussade de Chaumont
Waterman
Y. Ponty
Publication venue: Elsevier BV
Publication date: 01/01/2010

Consider a class of decomposable combinatorial structures, using different types of atoms $\mathcal{A} = \{a_1, \ldots, a_{|\mathcal{A}|}\}$. We address the random generation of such structures with respect to a size $n$ and a targeted distribution in $k$ of its distinguished atoms. We consider two variations on this problem. In the first alternative, the targeted distribution is given by $k$ real numbers $f_1, \ldots, f_k$ such that $0 < f_i < 1$ for all $i$ and $f_1 + \cdots + f_k \leq 1$. We aim to generate random structures among the whole set of structures of a given size $n$, in such a way that the expected frequency of any distinguished atom $a_i$ equals $f_i$. We address this problem by weighting the atoms with a $k$-tuple $\mathbf{w}$ of real-valued weights, inducing a weighted distribution over the set of structures of size $n$. We first adapt the classical recursive random generation scheme into an algorithm taking $O(n^{1+o(1)} + mn\log n)$ arithmetic operations to draw $m$ structures from the $\mathbf{w}$-weighted distribution. Secondly, we address the analytical computation of weights such that the targeted frequencies are achieved asymptotically, i.e. for large values of $n$. We derive systems of functional equations whose resolution gives an explicit relationship between $\mathbf{w}$ and $f_1, \ldots, f_k$. Lastly, we give an algorithm in $O(k n^4)$ for the inverse problem, i.e. computing the frequencies associated with a given $k$-tuple $\mathbf{w}$ of weights, and an optimized version in $O(k n^2)$ in the case of context-free languages. This allows for a heuristic resolution of the weights/frequencies relationship suitable for complex specifications. In the second alternative, the targeted distribution is given by $k$ natural numbers $n_1, \ldots, n_k$ such that $n_1 + \cdots + n_k + r = n$, where $r \geq 0$ is the number of undistinguished atoms. The structures must be generated uniformly among the set of structures of size $n$ that contain exactly $n_i$ atoms $a_i$ ($1 \leq i \leq k$). We give an $O(r^2 \prod_{i=1}^k n_i^2 + m n k \log n)$ algorithm for generating $m$ structures, which simplifies into $O(r \prod_{i=1}^k n_i + m n)$ for regular specifications.
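The classical recursive scheme with atom weights can be sketched on one small example. The grammar below (S -> empty | . S | ( S ) S, i.e. Motzkin-like words in dot-bracket notation) and the two weights `w_unpaired` and `w_pair` are illustrative assumptions, and this plain quadratic precomputation is not the optimized algorithm whose complexity is quoted above. Each word is drawn with probability proportional to the product of the weights of its atoms.

```python
import random


def weighted_counts(n, w_unpaired=1.0, w_pair=1.0):
    """W[m] = total weight of all words of size m generated by the grammar."""
    W = [1.0] + [0.0] * n
    for m in range(1, n + 1):
        total = w_unpaired * W[m - 1]              # rule S -> . S
        for k in range(m - 1):                     # rule S -> ( S ) S
            total += w_pair * W[k] * W[m - 2 - k]
        W[m] = total
    return W


def sample(n, W, w_unpaired=1.0, w_pair=1.0, rng=random):
    """Draw one word of size n from the weighted distribution."""
    if n == 0:
        return ""
    if W[n] <= 0:
        raise ValueError("no word of this size has positive weight")
    r = rng.random() * W[n]
    r -= w_unpaired * W[n - 1]
    if r < 0:
        return "." + sample(n - 1, W, w_unpaired, w_pair, rng)
    for k in range(n - 1):
        r -= w_pair * W[k] * W[n - 2 - k]
        if r < 0:
            return ("(" + sample(k, W, w_unpaired, w_pair, rng) + ")"
                    + sample(n - 2 - k, W, w_unpaired, w_pair, rng))
    # Floating-point rounding fallback: take the last positive-weight option.
    for k in reversed(range(n - 1)):
        if w_pair * W[k] * W[n - 2 - k] > 0:
            return ("(" + sample(k, W, w_unpaired, w_pair, rng) + ")"
                    + sample(n - 2 - k, W, w_unpaired, w_pair, rng))
    return "." + sample(n - 1, W, w_unpaired, w_pair, rng)


if __name__ == "__main__":
    weights = dict(w_unpaired=0.5, w_pair=2.0)
    W = weighted_counts(20, **weights)
    for _ in range(3):
        print(sample(20, W, **weights))
```

Raising `w_pair` relative to `w_unpaired` shifts the expected frequency of parentheses upward, which is the weights-to-frequencies relationship that the abstract proposes to invert.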

Elsevier - Publisher Connector

HAL-CentraleSupelec

HAL-Rennes 1

Multi-objective dynamic population shuffled frog-leaping biclustering of microarray data

Author: Chen Yiming
Hu Xiaohua
Li Zhoujun
Liu Feifei
Liu Junwan
Publication venue: BioMed Central
Publication date: 01/01/2012

Abstract: The structure of RiboNucleic Acid (RNA) has the potential to be altered by a Single Nucleotide Polymorphism (SNP). Disease-associated SNPs mapping to non-coding regions of the genome that are transcribed into RNA can potentially affect cellular regulation (and cause disease) by altering the structure of the transcript. We performed a large-scale meta-analysis of Selective 2'-Hydroxyl Acylation analyzed by Primer Extension (SHAPE) data, which probes the structure of RNA. We found that several single point mutations exist that significantly disrupt RNA secondary structure in the five transcripts we analyzed. Thus, every RNA that is transcribed has the potential to be a “RiboSNitch,” where a SNP causes a large conformational change that alters regulatory function. Predicting the SNPs that will have the largest effect on RNA structure remains a contemporary computational challenge. We therefore benchmarked the most popular RNA structure prediction algorithms for their ability to identify mutations that maximally affect structure. We also evaluated metrics for rank ordering the extent of the structural change. Although no single algorithm/metric combination dramatically outperformed the others, small differences in AUC (Area Under the Curve) values reveal that certain approaches do provide better agreement with experiment. The experimental data we analyzed nonetheless show that multiple single point mutations exist in all RNA transcripts that significantly disrupt structure in agreement with the predictions.

Springer - Publisher Connector

Directory of Open Access Journals

Carolina Digital Repository

Asymptotic structural properties of quasi-random saturated structures of RNA

Author: Clote P. (Peter)
Kranakis E. (Evangelos)
Krizanc D. (Danny)
Publication venue: Springer Science and Business Media LLC
Publication date: 25/10/2013

Background: RNA folding depends on the distribution of kinetic traps in the landscape of all secondary structures. Kinetic traps in the Nussinov energy model are precisely those secondary structures that are saturated, meaning that no base pair can be added without introducing either a pseudoknot or base triple. In previous work, we investigated asymptotic combinatorics of both random saturated structures and of quasi-random saturated structures, where the latter are constructed by a natural stochastic process. Results: We prove that for quasi-random saturated structures with the uniform distribution, the asymptotic expected number of external loops is O(log n) and the asymptotic expected maximum stem length is O(log n), while under the Zipf distribution, the asymptotic expected number of external loops is O(log² n) and the asymptotic expected maximum stem length is O(log n / log log n). Conclusions: Quasi-random saturated structures are generated by a stochastic greedy method, which is simple to implement. Structural features of random saturated structures appear to resemble those of quasi-random saturated structures, and the latter appear to constitute a class for which both the generation of sampled structures as well as a combinatorial investigation of structural features may be simpler to undertake.
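A stochastic greedy construction of a saturated structure can be sketched as follows. This is one plausible reading of the process under the uniform distribution, not necessarily the exact procedure analyzed in the paper: at each step a base pair is chosen uniformly at random among all pairs that can still be added, until none remains. The pairing rules, the minimum loop size `theta`, and the function names are assumptions made for illustration.

```python
import random

# Assumed pairing rules: Watson-Crick plus G-U wobble pairs.
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def addable_pairs(seq, s, theta=3):
    """All pairs (i, j) that can be added to s without a pseudoknot or base triple."""
    paired = {x for p in s for x in p}
    out = []
    for i in range(len(seq)):
        for j in range(i + theta + 1, len(seq)):
            if i in paired or j in paired:           # would create a base triple
                continue
            if seq[i] + seq[j] not in PAIRS:         # bases cannot pair
                continue
            if any(k < i < l < j or i < k < j < l for k, l in s):
                continue                             # would create a pseudoknot
            out.append((i, j))
    return out


def is_saturated(seq, s, theta=3):
    """A structure is saturated when no further base pair can be added."""
    return not addable_pairs(seq, s, theta)


def quasi_random_saturated(seq, theta=3, rng=random):
    """Greedily add uniformly chosen compatible pairs until saturation."""
    s = set()
    while True:
        candidates = addable_pairs(seq, s, theta)
        if not candidates:
            return s
        s.add(rng.choice(candidates))


if __name__ == "__main__":
    rng = random.Random(0)
    print(sorted(quasi_random_saturated("GGGAAAUCCCGAAAGC", rng=rng)))
```

Because every step keeps the structure valid and the loop stops only when no candidate is left, the result is saturated by construction, although different runs generally give different structures.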

Carleton University's Institutional Repository