11 research outputs found

    Efficient Algorithms for Probing the RNA Mutation Landscape

    The diversity and importance of the role played by RNAs in the regulation and development of the cell are now well-known and well-documented. This broad range of functions is achieved through specific structures that have been (presumably) optimized through evolution. State-of-the-art methods, such as McCaskill's algorithm, use a statistical mechanics framework based on the computation of the partition function over the canonical ensemble of all possible secondary structures on a given sequence. Although secondary structure predictions from thermodynamics-based algorithms are not as accurate as methods employing comparative genomics, the former methods are the only available tools to investigate novel RNAs, such as the many RNAs of unknown function recently reported by the ENCODE consortium. In this paper, we generalize the McCaskill partition function algorithm to sum over the grand canonical ensemble of all secondary structures of all mutants of the given sequence. Specifically, our new program, RNAmutants, simultaneously computes for each integer k the minimum free energy structure MFE(k) and the partition function Z(k) over all secondary structures of all k-point mutants, even allowing the user to specify certain positions required not to mutate and certain positions required to base-pair or remain unpaired. This technically important extension allows us to study the resilience of an RNA molecule to pointwise mutations. By computing the mutation profile of a sequence, a novel graphical representation of the mutational tendency of nucleotide positions, we analyze the deleterious nature of mutating specific nucleotide positions or groups of positions. We have successfully applied RNAmutants to investigate deleterious mutations (mutations that radically modify the secondary structure) in the Hepatitis C virus cis-acting replication element and to evaluate the evolutionary pressure applied on different regions of the HIV trans-activation response element. 
In particular, we show qualitative agreement between published Hepatitis C and HIV experimental mutagenesis studies and our analysis of deleterious mutations using RNAmutants. Our work also predicts other deleterious mutations, which could be verified experimentally. Finally, we provide evidence that the 3′ UTR of the GB RNA virus C has been optimized to preserve evolutionarily conserved stem regions from a deleterious effect of pointwise mutations. We hope that there will be long-term potential applications of RNAmutants in de novo RNA design and drug design against RNA viruses. This work also suggests potential applications for large-scale exploration of the RNA sequence-structure network. Binary distributions are available at http://RNAmutants.csail.mit.edu/
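To make the quantity Z(k) concrete, here is a minimal sketch rather than the RNAmutants algorithm itself. It assumes a toy energy model in which every base pair contributes the same energy (`pair_energy`, a hypothetical parameter), uses a McCaskill-style recursion over nested structures, and obtains Z(k) by brute-force enumeration of all k-point mutants. That enumeration is exponential in k and is exactly what an efficient method would need to avoid. All function names, `MIN_LOOP` and `rt` are illustrative assumptions.

```python
import math
from itertools import combinations, product

# Watson-Crick and G-U wobble pairs (toy model: all pairs weigh the same).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def partition_function(seq, pair_energy=-1.0, rt=0.6):
    """Sum of Boltzmann weights over all nested structures of seq.

    Z[i][j] covers the half-open interval seq[i:j]. The last base is either
    unpaired or paired with some k, which splits the interval unambiguously.
    """
    n = len(seq)
    weight = math.exp(-pair_energy / rt)
    Z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            total = Z[i][j - 1]  # base j-1 left unpaired
            for k in range(i, j - 1 - MIN_LOOP):  # base j-1 paired with k
                if (seq[k], seq[j - 1]) in PAIRS:
                    total += Z[i][k] * Z[k + 1][j - 1] * weight
            Z[i][j] = total
    return Z[0][n]


def mutants(seq, k):
    """Yield every sequence that differs from seq at exactly k positions."""
    for positions in combinations(range(len(seq)), k):
        choices = [[b for b in "ACGU" if b != seq[p]] for p in positions]
        for subs in product(*choices):
            s = list(seq)
            for p, b in zip(positions, subs):
                s[p] = b
            yield "".join(s)


def mutant_partition_function(seq, k, **kwargs):
    """Brute-force Z(k): sum of partition functions over all k-point mutants."""
    return sum(partition_function(m, **kwargs) for m in mutants(seq, k))
```

Setting `pair_energy=0.0` turns every Boltzmann weight into 1, so the functions simply count structures, which is a convenient sanity check on short sequences.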

    Sampled ensemble neutrality as a feature to classify potential structured RNAs

    A Combinatorial Framework for Designing (Pseudoknotted) RNA Algorithms

    We extend a hypergraph representation, introduced by Finkelstein and Roytberg, to unify dynamic programming algorithms in the context of RNA folding with pseudoknots. Classic applications of RNA dynamic programming (energy minimization, partition function, base-pair probabilities...) are reformulated within this framework, giving rise to very simple algorithms. This reformulation allows one to conceptually detach the conformation space/energy model -- captured by the hypergraph model -- from the specific application, assuming unambiguity of the decomposition. To ensure the latter property, we propose a new combinatorial methodology based on generating functions. We extend the set of generic applications by proposing an exact algorithm for extracting generalized moments in weighted distribution, generalizing a prior contribution by Miklos et al. Finally, we illustrate our full-fledged programme on three exemplary conformation spaces (secondary structures, Akutsu's simple type pseudoknots and kissing hairpins). This readily gives sets of algorithms that are either novel or have complexity comparable to classic implementations for minimization and Boltzmann ensemble applications of dynamic programming.

    A method for probing the mutational landscape of amyloid structure

    Motivation: Proteins of all kinds can self-assemble into highly ordered β-sheet aggregates known as amyloid fibrils, important both biologically and clinically. However, the specific molecular structure of a fibril can vary dramatically depending on sequence and environmental conditions, and mutations can drastically alter amyloid function and pathogenicity. Experimental structure determination has proven extremely difficult, with only a handful of NMR-based models proposed, suggesting a need for computational methods. Results: We present AmyloidMutants, a statistical mechanics approach for de novo prediction and analysis of wild-type and mutant amyloid structures. Based on the premise of protein mutational landscapes, AmyloidMutants energetically quantifies the effects of sequence mutation on fibril conformation and stability. Tested on non-mutant, full-length amyloid structures with known chemical shift data, AmyloidMutants offers roughly 2-fold improvement in prediction accuracy over existing tools. Moreover, AmyloidMutants is the only method to predict complete super-secondary structures, enabling accurate discrimination of topologically dissimilar amyloid conformations that correspond to the same sequence locations. Applied to mutant prediction, AmyloidMutants identifies a global conformational switch between Aβ and its highly toxic ‘Iowa’ mutant, in agreement with a recent experimental model based on partial chemical shift data. Predictions on mutant, yeast-toxic strains of HET-s suggest similar alternate folds. When applied to HET-s and a HET-s mutant with core asparagines replaced by glutamines (both highly amyloidogenic, chemically similar residues abundant in many amyloids), AmyloidMutants surprisingly predicts a greatly reduced capacity of the glutamine mutant to form amyloid. We confirm this finding by conducting mutagenesis experiments. National Institutes of Health (U.S.) (grant 1R01GM081871); National Institutes of Health (U.S.) (grant GM25874

    Parametric RNA Partition Function Algorithms

    Thesis advisor: Peter Clote. In addition to the well-characterized messenger RNA, transfer RNA and ribosomal RNA, many new classes of noncoding RNA (ncRNA) have been discovered in the past few years. ncRNA has been shown to play important roles in multiple regulation and development processes. The increasing need for RNA structural analysis software provides great opportunities for computational biologists. In this thesis I present three highly non-trivial RNA parametric structural analysis algorithms: 1) RNAhairpin and RNAmultiloop, which calculate partition functions with respect to hairpin number, multiloop number and multiloop order, 2) RNAshapeEval, which is based upon partition function calculation with respect to a fixed abstract shape, and 3) RNAprofileZ, which calculates the expected partition function and ensemble free energy given an RNA position weight matrix. I also describe the application of this software to biological problems, including evaluating the ability of purine riboswitch aptamer full alignment sequences to adopt their consensus shape, building hairpin and multiloop profiles for certain Rfam families, and predicting tRNA and pseudoknotted RNA secondary structures. These algorithms hold the promise to be useful in a broad range of biological problems such as structural motif search, ncRNA gene finding, and canonical and pseudoknotted secondary structure prediction. Thesis (MS), Boston College, 2010. Submitted to: Boston College. Graduate School of Arts and Sciences. Discipline: Biology

    The functional effects of RNA structure: from riboSNitches to translational control

    Ribonucleic acid (RNA) is a nucleotide polymer that, like deoxyribonucleic acid (DNA), has an essential role in the cell. RNA molecules are structurally distinct from DNA in the diversity of structures they adopt, due to stable intramolecular interactions. There exist a few well-defined cases of functional RNA structures, but many classes of RNA, including mRNA, adopt more flexible structures that are poorly characterized and unlinked to biological function. Accurate structure determination is thus essential to the study of RNAs. Much work involving RNA structure relies on computational prediction of RNA secondary structures. Here I briefly summarize RNA structure prediction and how genetic variation can alter RNA structure. By benchmarking algorithms that detect single nucleotide variant-induced structural change, we show that considering the full set of structures an RNA may adopt is crucial for the most accurate predictions. Conversely, constraining computational prediction with experimental structure probing data has been shown to greatly improve single-structure predictions. Thus, incorporating structure probing data such as selective 2'-hydroxyl acylation analyzed by primer extension and mutational profiling (SHAPE-MaP) is an alternative approach for modeling structural features of RNAs. In order to explore structure-function relationships in a model RNA, we gathered SHAPE-MaP structural data on a set of mRNAs derived from the human gene SERPINA1. Here I discuss the effect structure may have on mRNAs, especially during protein translation. We show with SHAPE-MaP constrained structure prediction that RNA structure has a role in determining SERPINA1 protein translation efficiency and that this effect can be quantitatively modeled. Doctor of Philosoph

    Identification of functional RNA structures in sequence data

    Thesis advisors: Michelle M. Meyer, Peter Clote. Structured RNAs have many biological functions ranging from catalysis of chemical reactions to gene regulation. Many of these homologous structured RNAs display most of their conservation at the secondary or tertiary structure level. As a result, strategies for natural structured RNA discovery rely heavily on identification of sequences sharing a common stable secondary structure. However, correctly identifying the functional elements of the structure continues to be challenging. In addition to studying natural RNAs, we improve our ability to distinguish functional elements by studying sequences derived from in vitro selection experiments to select structured RNAs that bind specific proteins. In this thesis, we seek to improve methods for distinguishing functional RNA structures from arbitrarily predicted structures in sequencing data. To do so, we developed novel algorithms that prioritize the structural properties of the RNA that are under selection. In order to identify natural structured ncRNAs, we bring concepts from evolutionary biology to bear on the de novo RNA discovery process. Since there is selective pressure to maintain the structure, we apply molecular evolution concepts such as neutrality to identify functional RNA structures. We hypothesize that alignments corresponding to structured RNAs should consist of neutral sequences. During the course of this work, we developed a novel measure of neutrality, the structure ensemble neutrality (SEN), which calculates neutrality by averaging the magnitude of structure retained over all single point mutations to a given sequence. In order to analyze in vitro selection data for RNA-protein binding motifs, we developed a novel framework that identifies enriched substructures in the sequence pool. 
Our method accounts for both sequence and structure components by abstracting the overall secondary structure into smaller substructures composed of a single base-pair stack. Unlike many current tools, our algorithm is designed to deal with the large data sets coming from high-throughput sequencing. In conclusion, our algorithms have similar performance to existing programs. However, unlike previous methods, our algorithms are designed to leverage the evolutionary selective pressures in order to emphasize functional structure conservation. Thesis (PhD), Boston College, 2016. Submitted to: Boston College. Graduate School of Arts and Sciences. Discipline: Biology
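The idea of averaging retained structure over all single point mutants can be sketched as follows. This is not the thesis's SEN implementation: SEN is defined over a structure ensemble, whereas this simplified sketch compares one dot-bracket structure per sequence. The structure predictor is passed in as a function, and `toy_fold` is a purely hypothetical stand-in (it only pairs the first and last base) so that the example runs without an external folding package.

```python
# Base pairs accepted by the toy predictor below.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def single_point_mutants(seq, alphabet="ACGU"):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, base in enumerate(seq):
        for b in alphabet:
            if b != base:
                yield seq[:i] + b + seq[i + 1:]


def retained_fraction(reference, other):
    """Fraction of positions whose dot-bracket symbol is unchanged."""
    if len(reference) != len(other):
        raise ValueError("structures must have the same length")
    if not reference:
        return 1.0
    same = sum(a == b for a, b in zip(reference, other))
    return same / len(reference)


def neutrality(seq, fold):
    """Average structure retained over all single point mutants of seq.

    `fold` is any callable mapping a sequence to a dot-bracket string.
    """
    reference = fold(seq)
    scores = [retained_fraction(reference, fold(m))
              for m in single_point_mutants(seq)]
    return sum(scores) / len(scores)


def toy_fold(seq):
    """Hypothetical predictor: pair the two end bases if they can pair."""
    if len(seq) >= 5 and (seq[0], seq[-1]) in PAIRS:
        return "(" + "." * (len(seq) - 2) + ")"
    return "." * len(seq)


if __name__ == "__main__":
    print(neutrality("GAAAC", toy_fold))
```

In practice `toy_fold` would be replaced by a real secondary structure predictor. Keeping the predictor as a parameter means the averaging logic does not depend on the folding model.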

    A signature of the structural polymorphism of non-coding ribonucleic acids for comparing their levels of biochemical activity

    Recent experimental evidence indicates that RNA structures change over time, sometimes very rapidly, and that these changes are both required for biochemical activity and captured by the secondary structure prediction software MC-Fold. RNA structure is thus dynamic. Comparing MC-Fold structure predictions, we found a clear link between near-optimal structures (in terms of the stability predicted by the software) and the variations in biochemical activity that follow point changes in the sequence. We compared RNA sequences from the point of view of their structural dynamics, computing a signature from the output of MC-Fold, so as to investigate how similar their biochemical activities were. This required a considerable acceleration of MC-Fold. The algorithmic approach to this acceleration is described in chapter 1. In chapter 2, point mutations that disrupt the biochemical activity of microRNAs are explained in terms of changes in RNA dynamics. Finally, in chapter 3, we identify dynamic structure windows in long RNAs with potentially significant roles in autism spectrum disorders and, separately, in egg polarisation in Xenopus spp. (frogs).