Search CORE

8 research outputs found

Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines

An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs. Our study provides a list of promising novel regulatory RNA motifs potentially involved in post-transcriptional gene regulation. Combined with the previous cis-regulatory DNA motif study in S. oneidensis, this genome-wide discovery of cis-regulatory RNA motifs may offer more comprehensive views of gene regulation at a different level in this organism. The RSSVM software, predictions, and analysis results on Shewanella genomes are available at http://ural.wustl.edu/resources.html#RSSVM

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

Identification and Characterization of Prokaryotic Regulatory Networks: Final Report

Author: Stormo Gary D.
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 05/10/2012
Field of study

We have completed our characterization of both the transcriptional regulatory network and post-transcriptional regulatory motifs in Shewanella

Crossref

UNT Digital Library

Sampled ensemble neutrality as a feature to classify potential structured RNAs

Author
Publication venue: BioMed Central
Publication date: 05/02/2015
Field of study

Springer - Publisher Connector

Computational identification of new structured cis-regulatory elements in the 3'-untranslated region of human protein coding genes

Author: Brown C.
Chen X.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012
Field of study

Messenger ribonucleic acids (RNAs) contain a large number of cis-regulatory RNA elements that function in many types of post-transcriptional regulation. These cis-regulatory elements are often characterized by conserved structures and/or sequences. Although some classes are well known, given the wide range of RNA-interacting proteins in eukaryotes, it is likely that many new classes of cis-regulatory elements are yet to be discovered. An approach to this is to use computational methods that have the advantage of analysing genomic data, particularly comparative data on a large scale. In this study, a set of structural discovery algorithms was applied followed by support vector machine (SVM) classification. We trained a new classification model (CisRNA-SVM) on a set of known structured cis-regulatory elements from 3′-untranslated regions (UTRs) and successfully distinguished these and groups of cis-regulatory elements not been strained on from control genomic and shuffled sequences. The new method outperformed previous methods in classification of cis-regulatory RNA elements. This model was then used to predict new elements from cross-species conserved regions of human 3′-UTRs. Clustering of these elements identified new classes of potential cis-regulatory elements. The model, training and testing sets and novel human predictions are available at: http://mRNA.otago.ac.nz/CisRNA-SVM

PubMed Central

MPG.PuRe

Shewanella knowledgebase: integration of the experimental data and computational predictions suggests a biological role for transcription of intergenic regions

Author: Abreu-Goodger
Arnaud
Belostotsky
Berretta
Biffinger
Biffinger
Byung H. Park
Caspi
Charania
Choi
Chourey
Daddaoua
Denise D. Schmoyer
Driscoll
Edgar
Edward C. Uberbacher
Elias
Faith
Fredrickson
Gao
Gralnick
Griffiths-Jones
Grissa
Gupta
Guruprasad H. Kora
Hager
Hau
Hua
Iliopoulos
Karp
Karp
Karp
Karpinets
Kazakov
Keseler
Kolker
Lanthier
Liu
Margaret F. Romine
Margrethe H. Serres
Markowitz
Martens
Michael R. Leuze
Misra
Montie
Munch
Mustafa H. Syed
Nagiza F. Samatova
Okuda
Passalacqua
Perez
Peterson
Pinchuk
Ponce
Pruitt
Rawlings
Ren
Riley
Romine
Romine
Stein
Syed
Tatiana V. Karpinets
Tusher
Uchiyama
Waterhouse
Wilson
Winsor
Xu
Xu
Zhao
Publication venue: Oxford University Press
Publication date
Field of study

Shewanellae are facultative γ-proteobacteria whose remarkable respiratory versatility has resulted in interest in their utility for bioremediation of heavy metals and radionuclides and for energy generation in microbial fuel cells. Extensive experimental efforts over the last several years and the availability of 21 sequenced Shewanella genomes made it possible to collect and integrate a wealth of information on the genus into one public resource providing new avenues for making biological discoveries and for developing a system level understanding of the cellular processes. The Shewanella knowledgebase was established in 2005 to provide a framework for integrated genome-based studies on Shewanella ecophysiology. The present version of the knowledgebase provides access to a diverse set of experimental and genomic data along with tools for curation of genome annotations and visualization and integration of genomic data with experimental data. As a demonstration of the utility of this resource, we examined a single microarray data set from Shewanella oneidensis MR-1 for new insights into regulatory processes. The integrated analysis of the data predicted a new type of bacterial transcriptional regulation involving co-transcription of the intergenic region with the downstream gene and suggested a biological role for co-transcription that likely prevents the binding of a regulator of the upstream gene to the regulator binding site located in the intergenic region

Crossref

PubMed Central

Expanding the repertoire of bacterial (non-)coding RNAs

Author: Findeiß Sven
Publication venue
Publication date: 03/07/2011
Field of study

The detection of non-protein-coding RNA (ncRNA) genes in bacteria and their diverse regulatory mode of action moved the experimental and bio-computational analysis of ncRNAs into the focus of attention. Regulatory ncRNA transcripts are not translated to proteins but function directly on the RNA level. These typically small RNAs have been found to be involved in diverse processes such as (post-)transcriptional regulation and modification, translation, protein translocation, protein degradation and sequestration. Bacterial ncRNAs either arise from independent primary transcripts or their mature sequence is generated via processing from a precursor. Besides these autonomous transcripts, RNA regulators (e.g. riboswitches and RNA thermometers) also form chimera with protein-coding sequences. These structured regulatory elements are encoded within the messenger RNA and directly regulate the expression of their “host” gene. The quality and completeness of genome annotation is essential for all subsequent analyses. In contrast to protein-coding genes ncRNAs lack clear statistical signals on the sequence level. Thus, sophisticated tools have been developed to automatically identify ncRNA genes. Unfortunately, these tools are not part of generic genome annotation pipelines and therefore computational searches for known ncRNA genes are the starting point of each study. Moreover, prokaryotic genome annotation lacks essential features of protein-coding genes. Many known ncRNAs regulate translation via base-pairing to the 5’ UTR (untranslated region) of mRNA transcripts. Eukaryotic 5’ UTRs have been routinely annotated by sequencing of ESTs (expressed sequence tags) for more than a decade. Only recently, experimental setups have been developed to systematically identify these elements on a genome-wide scale in prokaryotes. The first part of this thesis, describes three experimental surveys of exploratory field studies to analyze transcript organization in pathogenic bacteria. To identify ncRNAs in Pseudomonas aeruginosa we used a combination of an experimental RNomics approach and ncRNA prediction. Besides already known ncRNAs we identified and validated the expression of six novel RNA genes. Global detection of transcripts by next generation RNA sequencing techniques unraveled an unexpectedly complex transcript organization in many bacteria. These ultra high-throughput methods give us the appealing opportunity to analyze the complete RNA output of any species at once. The development of the differential RNA sequencing (dRNA-seq) approach enabled us to analyze the primary transcriptome of Helicobacter pylori and Xanthomonas campestris. For the first time we generated a comprehensive and precise transcription start site (TSS) map for both species and provide a general framework for the analysis of dRNA-seq data. Focusing on computer-aided analysis we developed new tools to annotate TSS, detect small protein-coding genes and to infer homology of newly detected transcripts. We discovered hundreds of TSS in intergenic regions, upstream of protein-coding genes, within operons and antisense to annotated genes. Analysis of 5’ UTRs (spanning from the TSS to the start codon of the adjacent protein-coding gene) revealed an unexpected size diversity ranging from zero to several hundred nucleotides. We identified and validated the expression of about 60 and about 20 ncRNA candidates in Helicobacter and Xanthomonas, respectively. Among these ncRNA candidates we found several small protein-coding genes that have previously evaded annotation in both species. We showed that the combination of dRNA-seq and computational analysis is a powerful method to examine prokaryotic transcriptomes. Experimental setups are time consuming and often combined with huge costs. Another limitation of experimental approaches is that genes which are expressed in specific developmental stages or stress conditions are likely to be missed. Bioinformatic tools build an alternative to overcome such restraints. General approaches usually depend on comparative genomic data and evolutionary signatures are used to analyze the (non-)coding potential of multiple sequence alignments. In the second part of my thesis we present our major update of the widely used ncRNA gene finder RNAz and introduce RNAcode, an efficient tool to asses local protein-coding potential of genomic regions. RNAz has been successfully used to identify structured RNA elements in all domains of life. However, our own experience and the user feedback not only demonstrated the applicability of the RNAz approach, but also helped us to identify limitations of the current implementation. Using a much larger training set and a new classification model we significantly improved the prediction accuracy of RNAz. During transcriptome analysis we repeatedly identified small protein-coding genes that have not been annotated so far. Only a few of those genes are known to date and standard proteincoding gene finding tools suffer from the lack of training data. To avoid an excess of false positive predictions, gene finding software is usually run with an arbitrary cutoff of 40-50 amino acids and therefore misses the small sized protein-coding genes. We have implemented RNAcode which is optimized for emerging applications not covered by standard protein-coding gene annotation software. In addition to complementing classical protein gene annotation, a major field of application of RNAcode is the functional classification of transcribed regions. RNA sequencing analyses are likely to falsely report transcript fragments (e.g. mRNA degradation products) as non-coding. Hence, an evaluation of the protein-coding potential of these fragments is an essential task. RNAcode reports local regions of high coding potential instead of complete protein-coding genes. A training on known protein-coding sequences is not necessary and RNAcode can therefore be applied to any species. We showed this with our analysis of the Escherichia coli genome where the current annotation could be accurately reproduced. We furthermore identified novel small protein-coding genes with RNAcode in this extensively studied genome. Using transcriptome and proteome data we found compelling evidence that several of the identified candidates are bona fide proteins. In summary, this thesis clearly demonstrates that bioinformatic methods are mandatory to analyze the huge amount of transcriptome data and to identify novel (non-)coding RNA genes. With the major update of RNAz and the implementation of RNAcode we contributed to complete the repertoire of gene finding software which will help to unearth hidden treasures of the RNA World

Qucosa - Publikationsserver der Universität Leipzig

In vivo characterization of RNA cis-regulators in bacteria

Author: Babina Arianne M.
Publication venue: 'Boston College University Libraries'
Publication date: 01/01/2017
Field of study

Thesis advisor: Michelle M. MeyerBacteria commonly utilize cis-acting mRNA structures that bind specific molecules to control gene expression in response to changing cellular conditions. Examples of these ligand-sensing RNA cis-regulators are found throughout the bacterial world and include riboswitches, which interact with small metabolites to modulate the expression of fundamental metabolic genes, and the RNA structures that bind select ribosomal proteins to regulate entire ribosomal protein operons. Despite advances in both non-coding RNA discovery and validation, many predicted regulatory RNA motifs remain uncharacterized and little work has examined how RNA cis-regulators behave within their physiological context in the cell. Furthermore, it is not well understood how structured RNA regulators emerge and are maintained within bacterial genomes. In this thesis, I validate the biological function of a conserved RNA cis-regulator of ribosomal protein synthesis previously discovered by my group using bioinformatic approaches. I then investigate how bacteria respond to the loss of two different cis-regulatory RNA structures. Using Bacillus subtilis as a model organism, I introduce point mutations into the native loci of the ribosomal protein L20-interacting RNA cis-regulator and the tandem glycine riboswitch and assay the strains for fitness defects. I find that disrupting these regulatory RNA structures results in severe mutant phenotypes, especially under harsh conditions such as low temperatures or high glycine concentrations. Together, this body of work highlights the advantages of examining RNA behavior within its biological context and emphasizes the important role RNA cis-regulators play in overall organismal viability. My studies shed light on the selective pressures that impact structured RNA evolution in vivo and reinforce the potential of cis-regulatory RNAs as novel antimicrobial targets.Thesis (PhD) — Boston College, 2017.Submitted to: Boston College. Graduate School of Arts and Sciences.Discipline: Biology

eScholarship@BC