
RNA structure prediction from evolutionary patterns of nucleotide composition

By S. Smit, R. Knight and J. Heringa


Structural elements in RNA molecules have a distinct nucleotide composition, which changes gradually over evolutionary time. We discovered certain features of these compositional patterns that are shared between all RNA families. Based on this information, we developed a structure prediction method that evaluates candidate structures for a set of homologous RNAs by their ability to reproduce the patterns exhibited by biological structures. The method is named SPuNC, for ‘Structure Prediction using Nucleotide Composition’. In a performance test on a diverse set of RNA families, we demonstrate that the SPuNC algorithm succeeds in selecting the most realistic structures in an ensemble. The average accuracy of top-scoring structures is significantly higher than the average accuracy of all ensemble members (improvements of more than 20% were observed). In addition, a consensus structure that includes the most reliable base pairs gleaned from a set of top-scoring structures is generally more accurate than a consensus derived from the full structural ensemble. Our method achieves better accuracy than existing methods on several RNA families, including novel riboswitches and ribozymes. The results clearly show that nucleotide composition can be used to reveal the quality of RNA structures, and thus the presented technique should be added to the set of prediction tools.
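The consensus step mentioned in the abstract can be illustrated with a minimal sketch. It is not the SPuNC implementation. It assumes that candidate structures are given in dot-bracket notation, that each one already carries a score (the scores below are made-up placeholders, not compositional scores), and that a base pair counts as "reliable" when it occurs in a strict majority of the selected structures. The function names `base_pairs`, `select_top` and `consensus_pairs` are hypothetical.

```python
from collections import Counter


def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            pairs.add((stack.pop(), j))
    return pairs


def select_top(scored_structures, k):
    """Return the k structures with the highest (placeholder) score."""
    ranked = sorted(scored_structures, key=lambda item: item[1], reverse=True)
    return [structure for structure, _ in ranked[:k]]


def consensus_pairs(structures, min_fraction=0.5):
    """Keep base pairs found in more than min_fraction of the structures.

    With min_fraction >= 0.5 the strict comparison guarantees that no
    position ends up paired twice, because two conflicting pairs cannot
    both occur in more than half of the structures.
    """
    counts = Counter(p for s in structures for p in base_pairs(s))
    cutoff = min_fraction * len(structures)
    return {p for p, n in counts.items() if n > cutoff}


# Toy ensemble: (structure, score). Scores are invented for illustration.
ensemble = [
    ("((..))..", 0.9),
    ("((..))..", 0.8),
    ("(....)..", 0.7),
    ("..(..)..", 0.1),
]

top = select_top(ensemble, 3)
top_consensus = consensus_pairs(top)
full_consensus = consensus_pairs([s for s, _ in ensemble])
```

In this toy ensemble the consensus of the three top-scoring structures keeps both pairs of the helix, while the consensus of the full ensemble loses the inner pair because the low-scoring structure dilutes its frequency.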

Topics: Computational Biology
Publisher: Oxford University Press
Provided by: PubMed Central