663 research outputs found

    RNAstructure: software for RNA secondary structure prediction and analysis

    BACKGROUND: To understand an RNA sequence's mechanism of action, the structure must be known. Furthermore, target RNA structure is an important consideration in the design of small interfering RNAs and antisense DNA oligonucleotides. RNA secondary structure prediction, using thermodynamics, can be used to develop hypotheses about the structure of an RNA sequence. RESULTS: RNAstructure is a software package for RNA secondary structure prediction and analysis. It uses thermodynamics and utilizes the most recent set of nearest neighbor parameters from the Turner group. It includes methods for secondary structure prediction (using several algorithms), prediction of base pair probabilities, bimolecular structure prediction, and prediction of a structure common to two sequences. This contribution describes new extensions to the package, including a library of C++ classes for incorporation into other programs, a user-friendly graphical user interface written in JAVA, and new Unix-style text interfaces. The original graphical user interface for Microsoft Windows is still maintained. CONCLUSION: The extensions to RNAstructure serve to make RNA secondary structure prediction user-friendly. The package is available for download from the Mathews lab homepage at http://rna.urmc.rochester.edu/RNAstructure.html.

    Combinatorial RNA Design: Designability and Structure-Approximating Algorithm

    In this work, we consider the Combinatorial RNA Design problem, a minimal instance of the RNA design problem which aims at finding a sequence that admits a given target as its unique base-pair-maximizing structure. We provide complete characterizations for the structures that can be designed using restricted alphabets. Under a classic four-letter alphabet, we provide a complete characterization of designable structures without unpaired bases. When unpaired bases are allowed, we provide partial characterizations for classes of designable/undesignable structures, and show that the class of designable structures is closed under the stutter operation. Membership of a given structure in any of the classes can be tested in linear time and, for positive instances, a solution can be found in linear time. Finally, we consider a structure-approximating version of the problem that allows extending bands (helices) and, assuming that the input structure avoids two motifs, we provide a linear-time algorithm that produces a designable structure with at most twice as many base pairs as the input structure. Comment: CPM - 26th Annual Symposium on Combinatorial Pattern Matching, Jun 2015, Ischia Island, Italy. LNCS, 201
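The design criterion in the abstract above (a sequence whose base-pair-maximizing structure is unique) can be checked for small inputs with a Nussinov-style dynamic program. The following is only an illustrative sketch, not the linear-time method of the paper: the function name `max_pairs_and_count` is hypothetical, and it assumes Watson-Crick pairs only, no pseudoknots, and no minimum hairpin loop length.

```python
from functools import lru_cache

# Assumption: only Watson-Crick pairs are allowed (no G-U wobble pairs).
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}

def max_pairs_and_count(seq):
    """Return (maximum number of base pairs, number of non-crossing
    structures that reach that maximum) for the given sequence."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Empty interval: exactly one structure (no pairs at all).
        if i > j:
            return (0, 1)
        # Case 1: position i stays unpaired.
        best, count = solve(i + 1, j)
        # Case 2: position i pairs with some k; the inside and the
        # outside of the pair are then independent subproblems.
        for k in range(i + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                inner_best, inner_count = solve(i + 1, k - 1)
                outer_best, outer_count = solve(k + 1, j)
                total = 1 + inner_best + outer_best
                ways = inner_count * outer_count
                if total > best:
                    best, count = total, ways
                elif total == best:
                    count += ways
        return (best, count)

    return solve(0, len(seq) - 1)

if __name__ == "__main__":
    for s in ("GGCC", "GCGC"):
        pairs, ways = max_pairs_and_count(s)
        print(s, pairs, "unique" if ways == 1 else f"{ways} optimal structures")
```

Because each structure is decomposed by what the leftmost position does (unpaired, or paired with a specific partner), every structure is counted exactly once. Under these assumptions, a sequence is a valid design for a target only if the count is 1 and the single optimum is the target itself. The recursion is cubic in time and intended for short sequences.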

    Improved prediction of RNA secondary structure by integrating the free energy model with restraints derived from experimental probing data.

    Recently, several experimental techniques have emerged for probing RNA structures based on high-throughput sequencing. However, most secondary structure prediction tools that incorporate probing data are designed and optimized for particular types of experiments. For example, RNAstructure-Fold is optimized for SHAPE data, while SeqFold is optimized for PARS data. Here, we report a new RNA secondary structure prediction method, restrained MaxExpect (RME), which can incorporate multiple types of experimental probing data and is based on a free energy model and an MEA (maximizing expected accuracy) algorithm. We first demonstrated that RME substantially improved secondary structure prediction with perfect restraints (base pair information of known structures). Next, we collected structure-probing data from diverse experiments (e.g. SHAPE, PARS and DMS-seq) and transformed them into a unified set of pairing probabilities with a posterior probabilistic model. By using the probability scores as restraints in RME, we compared its secondary structure prediction performance with two other well-known tools, RNAstructure-Fold (based on a free energy minimization algorithm) and SeqFold (based on a sampling algorithm). For SHAPE data, RME and RNAstructure-Fold performed better than SeqFold, because they markedly altered the energy model with the experimental restraints. For high-throughput data (e.g. PARS and DMS-seq) with lower probing efficiency, the secondary structure prediction performances of the tested tools were comparable, with performance improvements for only a portion of the tested RNAs.
However, when the effects of tertiary structure and protein interactions were removed, RME showed the highest prediction accuracy in the DMS-accessible regions by incorporating in vivo DMS-seq data. Funding: National Key Basic Research Program of China [2012CB316503]; National High-Tech Research and Development Program of China [2014AA021103]; National Natural Science Foundation of China [31271402]; Tsinghua University Initiative Scientific Research Program [2014z21045]; Hong Kong Research Grants Council Early Career Scheme [419612 to K.Y.]; National Science Foundation [1339282 to D.H.M.]; Computing Platform of the National Protein Facilities (Tsinghua University). Funding for open access charge: National Natural Science Foundation of China [31271402]

    Can Clustal-style progressive pairwise alignment of multiple sequences be used in RNA secondary structure prediction?

    BACKGROUND: In ribonucleic acid (RNA) molecules whose function depends on their final, folded three-dimensional shape (such as those in ribosomes or spliceosome complexes), the secondary structure, defined by the set of internal basepair interactions, is more consistently conserved than the primary structure, defined by the sequence of nucleotides. RESULTS: The research presented here investigates the possibility of applying a progressive, pairwise approach to the alignment of multiple RNA sequences by simultaneously predicting an energy-optimized consensus secondary structure. We take an existing algorithm for finding the secondary structure common to two RNA sequences, Dynalign, and alter it to align profiles of multiple sequences. We then explore the relative successes of different approaches to designing the tree that will guide progressive alignments of sequence profiles to create a multiple alignment and prediction of conserved structure. CONCLUSION: We have found that applying a progressive, pairwise approach to the alignment of multiple ribonucleic acid sequences produces highly reliable predictions of conserved basepairs, and we have shown how these predictions can be used as constraints to improve the results of a single-sequence structure prediction algorithm. However, we have also discovered that the amount of detail included in a consensus structure prediction is highly dependent on the order in which sequences are added to the alignment (the guide tree), and that if a consensus structure does not have sufficient detail, it is less likely to provide useful constraints for the single-sequence method.

    Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA

    BACKGROUND: In the field of RNA secondary structure prediction, the RNAalifold algorithm is one of the most popular methods using free energy minimization. However, general-purpose computers, including parallel computers or multi-core computers, exhibit parallel efficiency of no more than 50%. Field-Programmable Gate Array (FPGA) chips provide a new approach to accelerate RNAalifold by exploiting fine-grained custom design. RESULTS: RNAalifold shows complicated data dependences, in which the dependence distance is variable and the dependence direction spans two dimensions. We propose a systolic array structure including one master Processing Element (PE) and multiple slave PEs for fine-grained hardware implementation on FPGA. We exploit data reuse schemes to reduce the need to load energy matrices from external memory. We also propose several methods to reduce energy table parameter size by 80%. CONCLUSION: To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete RNAalifold algorithm. The experimental results show a factor of 12.2 speedup over the RNAalifold (ViennaPackage – 1.6.5) software for a group of aligned RNA sequences of 2981 residues, running on a Personal Computer (PC) platform with a Pentium 4 2.6 GHz CPU.

    Sorting live stem cells based on Sox2 mRNA expression.

    PMCID: PMC3507951. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. While cell sorting usually relies on cell-surface protein markers, molecular beacons (MBs) offer the potential to sort cells based on the presence of any expressed mRNA and in principle could be extremely useful to sort rare cell populations from primary isolates. We show here how stem cells can be purified from mixed cell populations by sorting based on MBs. Specifically, we designed molecular beacons targeting Sox2, a well-known stem cell marker for murine embryonic (mES) and neural stem cells (NSC). One of our designed molecular beacons displayed an increase in fluorescence compared to a nonspecific molecular beacon both in vitro and in vivo when tested in mES and NSCs. We sorted Sox2-MB(+)SSEA1(+) cells from a mixed population of 4-day retinoic acid-treated mES cells and effectively isolated live undifferentiated stem cells. Additionally, Sox2-MB(+) cells isolated from primary mouse brains were sorted and generated neurospheres with higher efficiency than Sox2-MB(-) cells. These results demonstrate the utility of MBs for stem cell sorting in an mRNA-specific manner

    Analysis of the EIAV Rev-Responsive Element (RRE) Reveals a Conserved RNA Motif Required for High Affinity Rev Binding in Both HIV-1 and EIAV

    A cis-acting RNA regulatory element, the Rev-responsive element (RRE), has essential roles in replication of lentiviruses, including human immunodeficiency virus (HIV-1) and equine infectious anemia virus (EIAV). The RRE binds the viral trans-acting regulatory protein, Rev, to mediate nucleocytoplasmic transport of incompletely spliced mRNAs encoding viral structural genes and genomic RNA. Because of its potential as a clinical target, RRE-Rev interactions have been well studied in HIV-1; however, detailed molecular structures of Rev-RRE complexes in other lentiviruses are still lacking. In this study, we investigate the secondary structure of the EIAV RRE and interrogate regulatory protein-RNA interactions in EIAV Rev-RRE complexes. Computational prediction and detailed chemical probing and footprinting experiments were used to determine the RNA secondary structure of EIAV RRE-1, a 555 nt region that provides RRE function in vivo. Chemical probing experiments confirmed the presence of several predicted loop and stem-loop structures, which are conserved among 140 EIAV sequence variants. Footprinting experiments revealed that Rev binding induces significant structural rearrangement in two conserved domains characterized by stable stem-loop structures. Rev binding region-1 (RBR-1) corresponds to a genetically-defined Rev binding region that overlaps exon 1 of the EIAV rev gene and contains an exonic splicing enhancer (ESE). RBR-2, characterized for the first time in this study, is required for high affinity binding of EIAV Rev to the RRE. RBR-2 contains an RNA structural motif that is also found within the high affinity Rev binding site in HIV-1 (stem-loop IIB), and within or near mapped RRE regions of four additional lentiviruses.
The powerful integration of computational and experimental approaches in this study has generated a validated RNA secondary structure for the EIAV RRE and provided provocative evidence that high affinity Rev binding sites of HIV-1 and EIAV share a conserved RNA structural motif. The presence of this motif in phylogenetically divergent lentiviruses suggests that it may play a role in highly conserved interactions that could be targeted in novel anti-lentiviral therapies

    An analysis of simple computational strategies to facilitate the design of functional molecular information processors

    BACKGROUND: Biological macromolecules (DNA, RNA and proteins) are capable of processing physical or chemical inputs to generate outputs that parallel conventional Boolean logical operators. However, the design of functional modules that will enable these macromolecules to operate as synthetic molecular computing devices is challenging. RESULTS: Using three simple heuristics, we designed RNA sensors that can mimic the function of a seven-segment display (SSD). Ten independent and orthogonal sensors representing the numerals 0 to 9 are designed and constructed. Each sensor has its own unique oligonucleotide binding site region that is activated uniquely by a specific input. Each operator was subjected to a stringent in silico filtering. Random sensors were selected and functionally validated via ribozyme self cleavage assays that were visualized via electrophoresis. CONCLUSIONS: By utilising simple permutation and randomisation in the sequence design phase, we have developed functional RNA sensors thus demonstrating that even the simplest of computational methods can greatly aid the design phase for constructing functional molecular devices. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1297-x) contains supplementary material, which is available to authorized users

    A mutate-and-map protocol for inferring base pairs in structured RNA

    Chemical mapping is a widespread technique for structural analysis of nucleic acids in which a molecule's reactivity to different probes is quantified at single-nucleotide resolution and used to constrain structural modeling. This experimental framework has been extensively revisited in the past decade with new strategies for high-throughput read-outs, chemical modification, and rapid data analysis. Recently, we have coupled the technique to high-throughput mutagenesis. Point mutations of a base-paired nucleotide can lead to exposure of not only that nucleotide but also its interaction partner. Carrying out the mutation and mapping for the entire system gives an experimental approximation of the molecule's contact map. Here, we give our in-house protocol for this mutate-and-map strategy, based on 96-well capillary electrophoresis, and we provide practical tips on interpreting the data to infer nucleic acid structure. Comment: 22 pages, 5 figure

    Antisense DNA parameters derived from next-nearest-neighbor analysis of experimental data

    BACKGROUND: The enumeration of tetrameric and other sequence motifs that are positively or negatively correlated with in vivo antisense DNA effects has been a useful addition to the arsenal of information needed to predict effective targets for antisense DNA control of gene expression. Such retrospective information derived from in vivo cellular experiments characterizes aspects of the sequence dependence of antisense inhibition that are not predicted by nearest-neighbor (NN) thermodynamic parameters derived from in vitro experiments. However, quantitation of the antisense contributions of motifs is problematic, since individual motifs are not isolated from the effects of neighboring nucleotides, and motifs may be overlapping. These problems are circumvented by a next-nearest-neighbor (NNN) analysis of antisense DNA effects in which the overlapping nature of nearest-neighbors is taken into account. RESULTS: Next-nearest-neighbor triplet combinations of nucleotides are the simplest that include overlapping sequence effects and therefore can encompass interactions beyond those of nearest neighbors. We used singular value decomposition (SVD) to fit experimental data from our laboratory in which phosphorothioate-modified antisense DNAs (S-DNAs) 20 nucleotides long were used to inhibit cellular protein expression in 112 experiments involving four gene targets and two cell lines. Data were fitted using a NNN model, neglecting end effects, to derive NNN inhibition parameters that could be combined to give parameters for a set of 49 sequences that represents the inhibitory effects of all possible overlapping triplet interactions in the cellular targets of these antisense S-DNAs. We also show that parameters to describe subsets of the data, such as the mRNAs being targeted and the cell lines used, can be included in such a derivation.
While NNN triplet parameters provided an adequate model to fit our data, NN doublet parameters did not. CONCLUSIONS: The methodology presented illustrates how NNN antisense inhibitory information can be derived from in vivo cellular experiments. Subsequent calculations of the antisense inhibitory parameters for any mRNA target sequence automatically take into account the effects of all possible overlapping combinations of nearest-neighbors in the sequence. This procedure is more robust than the tallying of tetrameric motifs that have positive or negative antisense effects. The specific parameters derived in this work are limited in their applicability by the relatively small database of experiments that was used in their derivation.
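To illustrate how overlapping triplets enter a next-nearest-neighbor description, here is a minimal sketch. It assumes a simple additive model in which a sequence's score is the sum of per-triplet parameters with end effects neglected. The function names and the parameter values are hypothetical placeholders, not the fitted values or the exact parameterization from the study.

```python
from collections import Counter

def triplet_counts(seq):
    """Count the overlapping triplets in seq.

    A sequence of length n contributes n - 2 triplets, so a 20-mer
    contributes 18.
    """
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))

def additive_score(seq, params):
    """Sum hypothetical per-triplet parameters over all overlapping
    triplets; triplets missing from params contribute zero."""
    return sum(n * params.get(t, 0.0) for t, n in triplet_counts(seq).items())

if __name__ == "__main__":
    # Placeholder values chosen only to make the arithmetic visible.
    toy_params = {"ACG": 0.5, "GTA": -0.25}
    print(triplet_counts("ACGTACG"))
    print(additive_score("ACGTACG", toy_params))
```

In a fitting setting, the triplet counts of each tested oligonucleotide would form one row of a design matrix, and the parameters would be estimated by least squares, for which SVD is one standard solver.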