Predicting RNA secondary structure by the comparative approach: how to select the homologous sequences
BACKGROUND: The secondary structure of an RNA must be known before the relationship between its structure and function can be determined. One way to predict the secondary structure of an RNA is to identify covarying residues that maintain the pairings (Watson-Crick, Wobble and non-canonical pairings). This "comparative approach" consists of identifying mutations from homologous sequence alignments. The sequences must covary enough for compensatory mutations to be revealed, but comparison is difficult if they are too different. Thus the choice of homologous sequences is critical. While many possible combinations of homologous sequences may be used for prediction, only a few will give good structure predictions. This can be due to poor quality alignment in stems or to the variability of certain sequences. This problem of sequence selection is currently unsolved. RESULTS: This paper describes an algorithm, SSCA, which measures the suitability of sequences for the comparative approach. It is based on evolutionary models with structure constraints, particularly those on sequence variations and stem alignment. We propose three models, based on different constraints on sequence alignments. We show the results of the SSCA algorithm for predicting the secondary structure of several RNAs. SSCA enabled us to choose sets of homologous sequences that gave better predictions than arbitrarily chosen sets of homologous sequences. CONCLUSION: SSCA is an algorithm for selecting combinations of RNA homologous sequences suitable for secondary structure predictions with the comparative approach.
Mutational Patterns in RNA Secondary Structure Evolution Examined in Three RNA Families
The goal of this work was to study mutational patterns in the evolution of RNA secondary structure. We analyzed bacterial tmRNA, RNaseP and eukaryotic telomerase RNA secondary structures, mapping structural variability onto phylogenetic trees constructed primarily from rRNA sequences. We found that secondary structures evolve both by whole stem insertion/deletion, and by mutations that create or disrupt stem base pairing. We analyzed the evolution of stem lengths and constructed substitution matrices describing the changes responsible for the variation in the RNA stem length. In addition, we used principal component analysis of the stem length data to determine the most variable stems in different families of RNA. This data provides new insights into the evolution of RNA secondary structures and patterns of variation in the lengths of double helical regions of RNA molecules. Our findings will facilitate design of improved mutational models for RNA structure evolution
Identification and comparative analysis of components from the signal recognition particle in protozoa and fungi
BACKGROUND: The signal recognition particle (SRP) is a ribonucleoprotein complex responsible for targeting proteins to the ER membrane. The SRP of metazoans is well characterized and composed of an RNA molecule and six polypeptides. The particle is organized into the S and Alu domains. The Alu domain has a translational arrest function and consists of the SRP9 and SRP14 proteins bound to the terminal regions of the SRP RNA. So far, our understanding of the SRP and its evolution in lower eukaryotes such as protozoa and yeasts has been limited. However, genome sequences of such organisms have recently become available, and we have now analyzed this information with respect to genes encoding SRP components. RESULTS: A number of SRP RNA and SRP protein genes were identified by an analysis of genomes of protozoa and fungi. The sequences and secondary structures of the Alu portion of the RNA were found to be highly variable. Furthermore, proteins SRP9/14 appeared to be absent in certain species. Comparative analysis of the SRP RNAs from different Saccharomyces species resulted in models which contain features shared between all SRP RNAs, but also a new secondary structure element in SRP RNA helix 5. Protein SRP21, previously thought to be present only in Saccharomyces, was shown to be a constituent of additional fungal genomes. Furthermore, SRP21 was found to be related to metazoan and plant SRP9, suggesting that the two proteins are functionally related. CONCLUSIONS: Analysis of a number of not previously annotated SRP components shows that the SRP Alu domain is subject to a more rapid evolution than the other parts of the molecule. For instance, the RNA portion is highly variable and the protein SRP9 seems to have evolved into the SRP21 protein in fungi. In addition, we identified a secondary structure element in the Saccharomyces RNA that has been inserted close to the Alu region.
Together, these results provide important clues as to the structure, function and evolution of SRP
An Introduction to RNA Databases
We present an introduction to RNA databases. The history and technology
behind RNA databases are briefly discussed. We examine differing methods of data
collection and curation, and discuss their impact on both the scope and
accuracy of the resulting databases. Finally, we demonstrate these principles
through detailed examination of four leading RNA databases: Noncode, miRBase,
Rfam, and SILVA.
Comment: 27 pages, 10 figures, 1 table. Submitted as a chapter for "An
introduction to RNA bioinformatics" to be published by "Methods in Molecular
Biology"
Diversity of 23S rRNA Genes within Individual Prokaryotic Genomes
The concept of ribosomal constraints on rRNA genes is deduced primarily based on the comparison of consensus rRNA sequences between closely related species, but recent advances in whole-genome sequencing allow evaluation of this concept within organisms with multiple rRNA operons. was the only species in which intragenomic diversity >3% was observed among 4 paralogous 23S rRNA genes. These findings indicate tight ribosomal constraints on individual 23S rRNA genes within a genome. Although classification using primary 23S rRNA sequences could be erroneous, significant diversity among paralogous 23S rRNA genes was observed only once in the 184 species analyzed, indicating little overall impact on the mainstream of 23S rRNA gene-based prokaryotic taxonomy
Structural Constraints Identified with Covariation Analysis in Ribosomal RNA
Covariation analysis is used to identify those positions with similar patterns of sequence variation in an alignment of RNA sequences. These constraints on the evolution of two positions are usually associated with a base pair in a helix. While mutual information (MI) has been used to accurately predict an RNA secondary structure and a few of its tertiary interactions, early studies revealed that phylogenetic event counting methods are more sensitive and provide extra confidence in the prediction of base pairs. We developed a novel and powerful phylogenetic events counting method (PEC) for quantifying positional covariation with the Gutell lab’s new RNA Comparative Analysis Database (rCAD). The PEC and MI-based methods each identify unique base pairs, and jointly identify many other base pairs. In total, both methods in combination with an N-best and helix-extension strategy identify the maximal number of base pairs. While covariation methods have effectively and accurately predicted RNA secondary structure, only a few tertiary structure base pairs have been identified. Analyses presented herein and at the Gutell lab’s Comparative RNA Web (CRW) Site reveal that the majority of these latter base pairs do not covary with one another. However, covariation analysis does reveal a weaker although significant covariation between sets of nucleotides that are in proximity in the three-dimensional RNA structure. This reveals that covariation analysis identifies other types of structural constraints beyond the two nucleotides that form a base pair
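As a rough illustration of the mutual information (MI) measure mentioned in the abstract above, the sketch below scores two alignment columns in plain Python. It is a minimal example of textbook MI only, not the authors' PEC method or any rCAD code; the function name `mutual_information` and the toy columns are hypothetical, and gaps or sequence weighting are ignored.

```python
import math
from collections import Counter


def mutual_information(col_i: str, col_j: str) -> float:
    """Mutual information (in bits) between two alignment columns.

    Each column is a string holding one character per aligned sequence.
    Assumption: every character is treated as an ordinary symbol, so
    gaps and ambiguity codes get no special handling.
    """
    n = len(col_i)
    if n == 0 or n != len(col_j):
        raise ValueError("columns must be non-empty and of equal length")

    counts_i = Counter(col_i)               # single-column counts
    counts_j = Counter(col_j)
    counts_ij = Counter(zip(col_i, col_j))  # joint counts of (i, j) pairs

    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (count_a * count_b)
        ratio = count * n / (counts_i[a] * counts_j[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Toy columns: the first pair covaries perfectly (G-C, C-G, A-U, U-A),
# the second pair is fully conserved and so carries no covariation signal.
covarying = mutual_information("GGCCAAUU", "CCGGUUAA")
conserved = mutual_information("GGGGGGGG", "CCCCCCCC")
print(covarying, conserved)
```

A high score marks two positions whose variation is coupled, which is the usual signature of a base pair; a fully conserved pair scores zero even if it is paired, one known limitation of MI on its own.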
Effects of Restrained Sampling Space and Nonplanar Amino Groups on Free-Energy Predictions for RNA with Imino and Sheared Tandem GA Base Pairs Flanked by GC, CG, iGiC or iCiG Base Pairs
Guanine-adenine (GA) base pairs play important roles in determining the structure, dynamics, and stability of RNA. In RNA internal loops, GA base pairs often occur in tandem arrangements and their structure is context and sequence dependent. Calculations reported here test the thermodynamic integration (TI) approach with the amber99 force field by comparing computational predictions of free energy differences with the free energy differences expected on the basis of NMR determined structures of the RNA motifs (5′-GCGGACGC-3′)2, (5′-GCiGGAiCGC-3′)2, (5′-GGCGAGCC-3′)2, and (5′-GGiCGAiGCC-3′)2. Here, iG and iC denote isoguanosine and isocytidine, which have amino and carbonyl groups transposed relative to guanosine and cytidine. The NMR structures show that the GA base pairs adopt either imino (cis Watson−Crick/Watson−Crick A-G) or sheared (trans Hoogsteen/Sugar edge A-G) conformations depending on the identity and orientation of the adjacent base pair. A new mixing function for the TI method is developed that allows alchemical transitions in which atoms can disappear in both the initial and final states. Unrestrained calculations gave ΔG° values 2−4 kcal/mol different from expectations based on NMR data. Restraining the structures with hydrogen bond restraints did not improve the predictions. Agreement with NMR data was improved by 0.7 to 1.5 kcal/mol, however, when structures were restrained with weak positional restraints to sample around the experimentally determined NMR structures. The amber99 force field was modified to partially include pyramidalization effects of the unpaired amino group of guanosine in imino GA base pairs. This provided little or no improvement in comparisons with experiment. A marginal improvement is observed when the structure has potential cross-strand out-of-plane hydrogen bonding with the G amino group.
The calculations using positional restraints and a nonplanar amino group reproduce the signs of ΔG° from the experimental results and are, thus, capable of providing useful qualitative insights complementing the NMR experiments. Decomposition of the terms in the calculations reveals that the dominant terms are from electrostatic and interstrand interactions other than hydrogen bonds in the base pairs. The results suggest that a better description of the backbone is key to reproducing the experimental free energy results with computational free energy predictions
Characteristics of the nuclear (18S, 5.8S, 28S and 5S) and mitochondrial (12S and 16S) rRNA genes of Apis mellifera (Insecta: Hymenoptera): structure, organization, and retrotransposable elements
As an accompanying manuscript to the release of the honey bee genome, we report the entire sequence of the nuclear (18S, 5.8S, 28S and 5S) and mitochondrial (12S and 16S) ribosomal RNA (rRNA)-encoding gene sequences (rDNA) and related internally and externally transcribed spacer regions of Apis mellifera (Insecta: Hymenoptera: Apocrita). Additionally, we predict secondary structures for the mature rRNA molecules based on comparative sequence analyses with other arthropod taxa and reference to recently published crystal structures of the ribosome. In general, the structures of honey bee rRNAs are in agreement with previously predicted rRNA models from other arthropods in core regions of the rRNA, with little additional expansion in non-conserved regions. Our multiple sequence alignments are made available on several public databases and provide a preliminary establishment of a global structural model of all rRNAs from the insects. Additionally, we provide conserved stretches of sequences flanking the rDNA cistrons that comprise the externally transcribed spacer regions (ETS) and part of the intergenic spacer region (IGS), including several repetitive motifs. Finally, we report the occurrence of retrotransposition in the nuclear large subunit rDNA, as R2 elements are present in the usual insertion points found in other arthropods. Interestingly, functional R1 elements usually present in the genomes of insects were not detected in the honey bee rRNA genes. The reverse transcriptase products of the R2 elements are deduced from their putative open reading frames and structurally aligned with those from another hymenopteran insect, the jewel wasp Nasonia (Pteromalidae). Stretches of conserved amino acids shared between Apis and Nasonia are illustrated and serve as potential sites for primer design, as target amplicons within these R2 elements may serve as novel phylogenetic markers for Hymenoptera. 
Given the impending completion of the sequencing of the Nasonia genome, we expect our report eventually to shed light on the evolution of the hymenopteran genome within higher insects, particularly regarding the relative maintenance of conserved rDNA genes, related variable spacer regions and retrotransposable elements