Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines

By Xing Xu, Yongmei Ji and Gary D. Stormo

Abstract

An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks.
We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs.
We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs.
Our study provides a list of promising novel regulatory RNA motifs potentially involved in post-transcriptional gene regulation. Combined with the previous cis-regulatory DNA motif study in S. oneidensis, this genome-wide discovery of cis-regulatory RNA motifs may offer more comprehensive views of gene regulation at a different level in this organism. The RSSVM software, predictions, and analysis results on Shewanella genomes are available at http://ural.wustl.edu/resources.html#RSSVM
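
The abstract compares classifiers by their sensitivity "while maintaining the same false positive rate". One way such a comparison could be set up is sketched below: for each tool, choose the score cutoff that admits a fixed fraction of random (negative) motifs, then measure how many known (positive) motifs pass that cutoff. This is only an illustration under stated assumptions, not the evaluation code used for RSSVM. The function names and the tool labels are hypothetical, and the scores are made-up toy values, not results from the study.

```python
def threshold_for_fpr(negative_scores, target_fpr):
    """Pick a cutoff so that at most target_fpr of negatives score strictly above it."""
    ranked = sorted(negative_scores, reverse=True)
    allowed = int(target_fpr * len(ranked))  # negatives permitted above the cutoff
    if allowed >= len(ranked):
        return float("-inf")  # every negative may pass
    # A motif is called positive only when its score is strictly greater than
    # the cutoff, so the first disallowed negative (and any ties) is rejected.
    return ranked[allowed]


def fraction_above(scores, threshold):
    """Fraction of scores strictly greater than the threshold."""
    return sum(1 for s in scores if s > threshold) / len(scores)


def sensitivity_at_fpr(positive_scores, negative_scores, target_fpr):
    """Return (sensitivity, achieved false positive rate) at the matched cutoff."""
    cutoff = threshold_for_fpr(negative_scores, target_fpr)
    return (fraction_above(positive_scores, cutoff),
            fraction_above(negative_scores, cutoff))


# Toy scores for two hypothetical tools (illustrative values only).
tools = {
    "tool_a": {
        "known": [0.95, 0.80, 0.70, 0.58, 0.50, 0.45, 0.30, 0.85],
        "random": [0.05, 0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.55, 0.60, 0.90],
    },
    "tool_b": {
        "known": [0.90, 0.60, 0.40, 0.35, 0.30, 0.20, 0.70, 0.10],
        "random": [0.10, 0.15, 0.20, 0.25, 0.30, 0.33, 0.38, 0.45, 0.50, 0.65],
    },
}

for name, scores in tools.items():
    sens, fpr = sensitivity_at_fpr(scores["known"], scores["random"], 0.2)
    print(f"{name}: sensitivity={sens:.3f} at false positive rate={fpr:.3f}")
```

Matching the false positive rate per tool, rather than sharing one raw cutoff, matters because different programs report scores on different scales; the cutoff is therefore derived from each tool's own scores on the random set.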

Topics: Research Article
Publisher: Public Library of Science
OAI identifier: oai:pubmedcentral.nih.gov:2659441
Provided by: PubMed Central