4,318 research outputs found

    Statistical modeling of RNA structure profiling experiments enables parsimonious reconstruction of structure landscapes.

    RNA plays key regulatory roles in diverse cellular processes, where its functionality often derives from folding into and converting between structures. Many RNAs further rely on co-existence of alternative structures, which govern their response to cellular signals. However, characterizing heterogeneous landscapes is difficult, both experimentally and computationally. Recently, structure profiling experiments have emerged as powerful and affordable structure characterization methods, which improve computational structure prediction. To date, efforts have centered on predicting one optimal structure, with much less progress made on multiple-structure prediction. Here, we report a probabilistic modeling approach that predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data. We demonstrate robust landscape reconstruction and quantitative insights into structural dynamics by analyzing numerous data sets. This work establishes a framework for data-directed characterization of structure landscapes to aid experimentalists in performing structure-function studies.

    Developing Techniques for the Identification of Non-Canonical RNA Pairing and Analysis of LC-MS Datasets

    Non-canonical pairing dynamics in ribonucleic acid (RNA) structure and statistical analysis of metabolomics liquid chromatography mass spectrometry (LC-MS) datasets are two difficult problems that stand as open challenges. RNA folding algorithms are used across a variety of disciplines to predict structures when experimental elucidation techniques are inconvenient or impractical. Though successful and widely adopted, folding algorithms make simplifying assumptions for loop regions because of their complex interactions and the associated difficulty of generating energy parameters for relevant non-canonical pairing interactions. Modeling assumptions and a lack of energy parameters for loops limit accuracy in these functionally critical regions of RNA. This work describes a new technique for probing non-canonical loop interactions through the combined analysis of dimethyl sulfate (DMS) and three-dimensional crystallographic data. We demonstrate that DMS data encodes information about non-canonical pairing which describes these interactions in an efficient, high-throughput manner. Metabolomics aims to understand biological processes through the analysis of small-molecule metabolites. The field primarily uses 1H nuclear magnetic resonance (NMR) spectroscopy as well as LC-MS to identify and quantitate metabolites. With even simple samples having hundreds or thousands of metabolites, researchers in the field have developed software pipelines to make metabolomics studies a tractable task. Numerous packages exist for the analysis of either 1H NMR or LC-MS data, but current offerings force researchers to use multiple packages to analyze both data types. To address the need for a metabolomics package capable of analyzing both, we have developed new LC-MS functionality for the NMR metabolomics package MVAPACK. Advisor: Joseph D. Yesselma

    The Folding Kinetics of RNA

    RNAs are biomolecules ubiquitous in all living cells. Usually, they fold into complex molecular structures, which often mediate their biological function. In this work, models of RNA folding have been studied in detail. One can distinguish two fundamentally different approaches to RNA folding. The first one is the thermodynamic approach, which yields information about the distribution of structures in the ensemble at equilibrium. The second approach, which is required to study the dynamics of folding over the course of time, is the kinetic folding analysis. It is much more computationally expensive, but makes it possible to incorporate changing environmental parameters as well as time-dependent effects into the analysis. Building on these methods, the BarMap framework (Hofacker, Flamm, et al., 2010) makes it possible to chain several pre-computed models and thus simulate folding reactions in a dynamically changing environment, e.g., to model co-transcriptional folding. However, there is no obvious way to identify spurious output, let alone to assess the quality of the simulation results. As a remedy, BarMap-QA, a semi-automatic software pipeline for the analysis of co-transcriptional folding, has been developed. For a given input sequence, it automatically generates the models for every step of the RNA elongation, applies BarMap to link them together, and runs the simulation. Post-processing scripts, visualizations, and an integrated viewer are provided to facilitate the evaluation of the unwieldy BarMap output. Three novel, complementary quality measures are computed on the fly, allowing the analyst to evaluate the coverage of the computed models, the exactness of the computed mapping between the individual states of each model, and the fraction of correctly mapped population during the simulation run. In case of deficiencies, the output is automatically re-rendered after parameter adjustment. 
Statistical evidence is presented that, even when coarse graining the ensemble, kinetic simulations quickly become infeasible for longer RNAs. However, within the individual gradient basins, most high-energy structures only have a marginal probability and could safely be excluded from the analysis. To tell relevant and irrelevant structures apart, a precise knowledge of the distribution of probability mass within a basin is necessary. Both a theoretical result concerning the shape of its density and possible applications, such as the prediction of a basin’s partition function, are given. To demonstrate the applicability of computational folding simulations to a real-world task in the life sciences, we conducted an in silico design process for a synthetic, transcriptional riboswitch responding to the ligand neomycin. The designed constructs were then transfected into the bacterium Escherichia coli by a collaborative partner and could successfully regulate a fluorescent reporter gene depending on the presence of the ligand. Additionally, it was shown that the sequence context of the riboswitch could have detrimental effects on its functionality, but also that RNA folding simulations are often capable of predicting these interactions and providing solutions in the form of decoupling spacer elements. Taken together, this thesis offers the reader deep insights into the world of RNA folding and its models, and how these can be applied to design novel biomolecules.

    RNA and protein 3D structure modeling: similarities and differences

    In analogy to proteins, the function of RNA depends on its structure and dynamics, which are encoded in the linear sequence. While there are numerous methods for computational prediction of protein 3D structure from sequence, there have been very few such methods for RNA. This review discusses template-based and template-free approaches for macromolecular structure prediction, with special emphasis on comparison between the already tried-and-tested methods for protein structure modeling and the very recently developed “protein-like” modeling methods for RNA. We highlight analogies between many successful methods for modeling these two types of biological macromolecules and argue that RNA 3D structure can be modeled using “protein-like” methodology. We also highlight the areas where the differences between RNA and proteins require the development of RNA-specific solutions.

    Dissecting the secondary structure of the circular RNA of a nuclear viroid in vivo: A "naked" rod-like conformation similar but not identical to that observed in vitro

    With a minimal (250-400 nt), non-protein-coding, circular RNA genome, viroids rely on sequence/structural motifs for replication and colonization of their host plants. These motifs are embedded in a compact secondary structure whose elucidation is crucial to understand how they function. Viroid RNA structure has been tackled in silico with algorithms searching for the conformation of minimal free energy, and in vitro by probing in solution with RNases, dimethyl sulphate and bisulphite, and with selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE), which interrogates the RNA backbone at single-nucleotide resolution. However, in vivo approaches at that resolution have not been assayed. Here, after confirming by three thermodynamics-based predictions and by in vitro SHAPE that the secondary structure adopted by the infectious monomeric circular (+) RNA of potato spindle tuber viroid (PSTVd) is a rod-like conformation with double-stranded segments flanked by loops, we have probed it in vivo with a SHAPE modification. We provide direct evidence that a similar, but not identical, rod-like conformation exists in PSTVd-infected leaves of Nicotiana benthamiana, verifying the long-standing view that this RNA accumulates in planta as a naked form rather than tightly associated with protecting host proteins. However, certain nucleotides of the central conserved region, including some of the loop E involved in key functions such as replication, are more SHAPE-reactive in vitro than in vivo. This difference is most likely due to interactions with proteins mediating some of these functions, or to structural changes promoted by other factors of the in vivo habitat.
    This work was supported by grant BFU2014-56812-P (to R.F.) from the Ministerio de Economia y Competitividad of Spain. A.L.C. was the recipient of a predoctoral fellowship from the same organization.
    López-Carrasco, MA.; Flores Pedauye, R. (2017). Dissecting the secondary structure of the circular RNA of a nuclear viroid in vivo: A "naked" rod-like conformation similar but not identical to that observed in vitro. RNA Biology. 14(8):1046-1054. https://doi.org/10.1080/15476286.2016.1223005

    ModeRNA: a tool for comparative modeling of RNA 3D structure

    RNA is a large group of functionally important biomacromolecules. In striking analogy to proteins, the function of RNA depends on its structure and dynamics, which in turn are encoded in the linear sequence. However, while there are numerous methods for computational prediction of protein three-dimensional (3D) structure from sequence, with comparative modeling being the most reliable approach, there are very few such methods for RNA. Here, we present ModeRNA, a software tool for comparative modeling of RNA 3D structures. As input, ModeRNA requires a 3D structure of a template RNA molecule and a sequence alignment between the target to be modeled and the template. It must be emphasized that a good alignment is required for successful modeling, and for large and complex RNA molecules the development of a good alignment usually requires manual adjustments of the input data based on previous expertise with the respective RNA family. ModeRNA can model post-transcriptional modifications, a functionally important feature analogous to post-translational modifications in proteins. ModeRNA can also model DNA structures or use them as templates. It is equipped with many functions for merging fragments of different nucleic acid structures into a single model and analyzing their geometry. Windows and UNIX implementations of ModeRNA with comprehensive documentation and a tutorial are freely available.

    Modeling RNA tertiary structure motifs by graph-grammars

    A new approach, graph-grammars, to encode RNA tertiary structure patterns is introduced and exemplified with the classical sarcin–ricin motif. The sarcin–ricin motif is found in the stem of the crucial ribosomal loop E (also referred to as the sarcin–ricin loop), which is sensitive to the α-sarcin and ricin toxins. Here, we generate a graph-grammar for the sarcin–ricin motif and apply it to derive putative sequences that would fold into this motif. The biological relevance of the derived sequences is confirmed by a comparison with those found in known sarcin–ricin sites in an alignment of over 800 bacterial 23S ribosomal RNAs. The comparison raised alternative alignments in a few sarcin–ricin sites, which were assessed using tertiary structure predictions and 3D modeling. The sarcin–ricin motif graph-grammar was built with indivisible nucleotide interaction cycles that were recently observed in structured RNAs. A comparison of the sequences and 3D structures of each cycle that constitutes the sarcin–ricin motif gave us additional insights about RNA sequence–structure relationships. In particular, this analysis revealed that the sequence space of an RNA motif depends on a structural context that goes beyond the single base pairing and base-stacking interactions.

    Relationship between mRNA secondary structure and sequence variability in Chloroplast genes: possible life history implications

    Background: Synonymous sites are freer to vary because of redundancy in the genetic code. Messenger RNA secondary structure restricts this freedom, as revealed by previous findings in mitochondrial genes that mutations at third codon position nucleotides in helices are more selected against than those in loops. This motivated us to explore the constraints imposed by mRNA secondary structure on evolutionary variability at all codon positions in general, in chloroplast systems. Results: We found that the evolutionary variability and intrinsic secondary structure stability of these sequences share an inverse relationship. Simulations of most likely single nucleotide evolution in Psilotum nudum and Nephroselmis olivacea mRNAs indicate that helix-forming propensities of mutated mRNAs are greater than those of the natural mRNAs for short sequences and vice versa for long sequences. Moreover, helix-forming propensity estimated by the percentage of total mRNA in helices increases gradually with mRNA length, saturating beyond 1000 nucleotides. Protection levels of functionally important sites vary across plants and proteins: r-strategists minimize mutation costs in large genes; K-strategists do the opposite. Conclusion: mRNA length presumably predisposes shorter mRNAs to evolve under different constraints than longer mRNAs. The positive correlation between secondary structure protection and functional importance of sites suggests that some sites might be conserved due to packing-protection constraints at the nucleic acid level in addition to protein level constraints. Consequently, nucleic acid secondary structure a priori biases mutations. The converse (exposure of conserved sites) apparently occurs in a smaller number of cases, indicating a different evolutionary adaptive strategy in these plants. The differences between the protection levels of functionally important sites for r- and K-strategists reflect their respective molecular adaptive strategies. These converge with increasing domestication levels of K-strategists, perhaps because domestication increases reproductive output.

    Computational Methods for Comparative Non-coding RNA Analysis: from Secondary Structures to Tertiary Structures

    Unlike messenger RNAs (mRNAs), whose information is encoded in the primary sequences, the cellular roles of non-coding RNAs (ncRNAs) originate from their structures. Therefore, studying the structural conservation in ncRNAs is important to yield an in-depth understanding of their functionalities. In the past years, many computational methods have been proposed to analyze the common structural patterns in ncRNAs using comparative methods. However, RNA structural comparison is not a trivial task, and the existing approaches still have numerous issues in efficiency and accuracy. In this dissertation, we will introduce a suite of novel computational tools that extend the classic models for ncRNA secondary and tertiary structure comparisons. For RNA secondary structure analysis, we first developed a computational tool, named PhyloRNAalifold, to integrate the phylogenetic information into the consensus structural folding. The underlying idea of this algorithm is that the importance of a co-varying mutation should be determined by its position on the phylogenetic tree. By assigning high scores to the critical covariances, the prediction of RNA secondary structure can be more accurate. Besides structure prediction, we also developed a computational tool, named ProbeAlign, to improve the efficiency of genome-wide ncRNA screening by using high-throughput RNA structural probing data. It treats the chemical reactivities embedded in the probing information as pairing attributes of the searching targets. This approach can avoid the time-consuming base pair matching in the secondary structure alignment. The application of ProbeAlign to the FragSeq datasets shows its capability for genome-wide ncRNA analysis. For RNA tertiary structure analysis, we first developed a computational tool, named STAR3D, to find the global conservation in RNA 3D structures. STAR3D aims at finding the consensus of stacks by using 2D topology and 3D geometry together. 
Then, the loop regions can be ordered and aligned according to their relative positions in the consensus. This stack-guided alignment method brings the divide-and-conquer strategy into RNA 3D structural alignment, which has improved its efficiency dramatically. Furthermore, we have also clustered all loop regions in non-redundant RNA 3D structures to detect plausible RNA structural motifs de novo. The computational pipeline, named RNAMSC, was extended to handle large-scale PDB datasets, and solid downstream analysis was performed to ensure the clustering results are valid and easy to apply in further research. The final results contain many interesting variations of known motifs, such as GNAA tetraloop, kink-turn, sarcin-ricin and t-loops. We also discovered novel functional motifs that are conserved in a wide range of ncRNAs, including ribosomal RNA, sgRNA, SRP RNA, GlmS riboswitch and twister ribozyme.