
    Expanding the repertoire of bacterial (non-)coding RNAs

    The detection of non-protein-coding RNA (ncRNA) genes in bacteria and their diverse regulatory mode of action moved the experimental and bio-computational analysis of ncRNAs into the focus of attention. Regulatory ncRNA transcripts are not translated to proteins but function directly on the RNA level. These typically small RNAs have been found to be involved in diverse processes such as (post-)transcriptional regulation and modification, translation, protein translocation, protein degradation and sequestration. Bacterial ncRNAs either arise from independent primary transcripts or their mature sequence is generated via processing from a precursor. Besides these autonomous transcripts, RNA regulators (e.g. riboswitches and RNA thermometers) also form chimera with protein-coding sequences. These structured regulatory elements are encoded within the messenger RNA and directly regulate the expression of their “host” gene. The quality and completeness of genome annotation is essential for all subsequent analyses. In contrast to protein-coding genes ncRNAs lack clear statistical signals on the sequence level. Thus, sophisticated tools have been developed to automatically identify ncRNA genes. Unfortunately, these tools are not part of generic genome annotation pipelines and therefore computational searches for known ncRNA genes are the starting point of each study. Moreover, prokaryotic genome annotation lacks essential features of protein-coding genes. Many known ncRNAs regulate translation via base-pairing to the 5’ UTR (untranslated region) of mRNA transcripts. Eukaryotic 5’ UTRs have been routinely annotated by sequencing of ESTs (expressed sequence tags) for more than a decade. Only recently, experimental setups have been developed to systematically identify these elements on a genome-wide scale in prokaryotes. 
The first part of this thesis describes three exploratory experimental surveys of transcript organization in pathogenic bacteria. To identify ncRNAs in Pseudomonas aeruginosa we combined an experimental RNomics approach with ncRNA prediction. Besides already known ncRNAs, we identified and validated the expression of six novel RNA genes. Global detection of transcripts by next-generation RNA sequencing techniques has revealed an unexpectedly complex transcript organization in many bacteria. These ultra-high-throughput methods offer the opportunity to analyze the complete RNA output of any species at once. The development of the differential RNA sequencing (dRNA-seq) approach enabled us to analyze the primary transcriptomes of Helicobacter pylori and Xanthomonas campestris. For the first time, we generated a comprehensive and precise transcription start site (TSS) map for both species and provide a general framework for the analysis of dRNA-seq data. Focusing on computer-aided analysis, we developed new tools to annotate TSSs, detect small protein-coding genes, and infer homology of newly detected transcripts. We discovered hundreds of TSSs in intergenic regions, upstream of protein-coding genes, within operons, and antisense to annotated genes. Analysis of 5’ UTRs (spanning from the TSS to the start codon of the adjacent protein-coding gene) revealed an unexpected size diversity ranging from zero to several hundred nucleotides. We identified and validated the expression of about 60 and about 20 ncRNA candidates in Helicobacter and Xanthomonas, respectively. Among these ncRNA candidates we found several small protein-coding genes that had previously evaded annotation in both species. We showed that the combination of dRNA-seq and computational analysis is a powerful method to examine prokaryotic transcriptomes. Experimental setups are time-consuming and often costly. 
Another limitation of experimental approaches is that genes expressed only in specific developmental stages or under stress conditions are likely to be missed. Bioinformatic tools offer an alternative that overcomes such constraints. General approaches usually depend on comparative genomic data, using evolutionary signatures to analyze the (non-)coding potential of multiple sequence alignments. In the second part of this thesis we present our major update of the widely used ncRNA gene finder RNAz and introduce RNAcode, an efficient tool to assess the local protein-coding potential of genomic regions. RNAz has been successfully used to identify structured RNA elements in all domains of life. However, our own experience and user feedback not only demonstrated the applicability of the RNAz approach, but also helped us to identify limitations of the current implementation. Using a much larger training set and a new classification model, we significantly improved the prediction accuracy of RNAz. During transcriptome analysis we repeatedly identified small protein-coding genes that had not been annotated so far. Only a few such genes are known to date, and standard protein-coding gene finding tools suffer from the lack of training data. To avoid an excess of false positive predictions, gene finding software is usually run with an arbitrary cutoff of 40-50 amino acids and therefore misses small protein-coding genes. We have implemented RNAcode, which is optimized for emerging applications not covered by standard protein-coding gene annotation software. In addition to complementing classical protein gene annotation, a major field of application of RNAcode is the functional classification of transcribed regions. RNA sequencing analyses are likely to falsely report transcript fragments (e.g. mRNA degradation products) as non-coding. Hence, evaluating the protein-coding potential of these fragments is an essential task. 
RNAcode reports local regions of high coding potential instead of complete protein-coding genes. Training on known protein-coding sequences is not necessary, and RNAcode can therefore be applied to any species. We showed this with our analysis of the Escherichia coli genome, where the current annotation could be accurately reproduced. We furthermore identified novel small protein-coding genes with RNAcode in this extensively studied genome. Using transcriptome and proteome data, we found compelling evidence that several of the identified candidates are bona fide proteins. In summary, this thesis demonstrates that bioinformatic methods are mandatory to analyze the huge amount of transcriptome data and to identify novel (non-)coding RNA genes. With the major update of RNAz and the implementation of RNAcode we contributed to completing the repertoire of gene finding software, which will help to unearth hidden treasures of the RNA World.
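The effect of the length cutoff mentioned in the abstract above can be illustrated with a minimal sketch. This is not RNAcode's method (RNAcode scores evolutionary signatures in multiple sequence alignments); it is a naive forward-strand ATG-to-stop scan, and the function name `find_orfs`, the toy sequence, and the cutoff values of 50 and 10 codons are all assumptions made for illustration.

```python
# Naive open reading frame (ORF) scan, forward strand only.
# Hypothetical illustration of a minimum-length cutoff; not RNAcode.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_aa):
    """Return (start, end, aa_length) for each ATG..stop ORF whose
    length in codons (stop excluded) is at least min_aa."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                # Walk codon by codon until a stop codon or the sequence end.
                j = i
                while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                    j += 3
                if j + 3 > len(seq):
                    break  # no in-frame stop downstream; give up on this frame
                aa_len = (j - i) // 3
                if aa_len >= min_aa:
                    orfs.append((i, j + 3, aa_len))
                i = j + 3
            else:
                i += 3
    return orfs


# Toy sequence: a 29-codon ORF, a one-base spacer, then a 60-codon ORF.
small_orf = "ATG" + "GCT" * 28 + "TAA"
long_orf = "ATG" + "GCT" * 59 + "TGA"
toy_genome = small_orf + "C" + long_orf

strict = find_orfs(toy_genome, min_aa=50)   # typical-style cutoff
relaxed = find_orfs(toy_genome, min_aa=10)  # permissive cutoff
print(len(strict), len(relaxed))
```

With the strict cutoff only the 60-codon ORF is reported; the 29-codon ORF appears only when the cutoff is relaxed, which on a real genome would also admit many spurious short ORFs. That trade-off is the motivation the abstract gives for a dedicated tool.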

    Investigating prokaryotic transcriptomes and the impact of crosstalk between noncoding RNA and messenger RNA interactions

    Prokaryotes have a complex noncoding RNA (ncRNA) based regulatory system, resembling that of eukaryotes. Recent transcriptomics studies also point to an abundance of highly expressed uncharacterized RNAs in archaea and bacteria. However, despite recent advances indicating the prevalence of ncRNAs in prokaryotes, it is still unknown to what extent these uncharacterized transcripts are functional. Therefore, we have proposed a phylogeny-informed approach to designing new RNA sequencing (RNA-seq) experiments, which increases the information harnessed from transcriptome data for ncRNA detection. Many regulatory ncRNAs engage in RNA-RNA interactions, where RNA molecules bind to form a duplex. Predicting the true targets of an RNA enables successful functional characterization, and these targets can be estimated by bioinformatics methods. However, the algorithms developed to date are imperfect, and it is an open question which ones perform well and whether they can be improved upon. Towards this goal we performed a computational benchmark study to find reliable algorithms for RNA-RNA interaction prediction. We found that energy-based methods, which include the accessibility of interaction regions, are currently the most accurate. Many ncRNAs, including housekeeping ncRNA genes, are highly expressed. The abundances of interacting RNA molecules enable RNA-RNA duplex formation. In chapter IV we explore the impact of high-abundance RNAs on protein expression due to crosstalk RNA-RNA interactions between mRNAs and ncRNAs. With extensive RNA-RNA interaction predictions we reveal that RNA avoidance is an evolutionarily conserved phenomenon among prokaryotes, which means that core mRNAs have evolved to avoid crosstalk interactions with abundant ncRNAs. Our predictions also reveal that RNA avoidance may influence protein expression. To test this, we investigated the stability of interactions between mRNAs and core ncRNAs. 
These predictions show that RNA avoidance influences the final protein abundances. In conclusion, the primary aims of this study are to investigate the prokaryotic transcriptome for novel ncRNA genes and to examine the effects of crosstalk RNA interactions. We present a method to increase the information gained from prokaryotic transcriptomes for ncRNA identification. We also present the most comprehensive benchmark of RNA-RNA interaction prediction algorithms to date. Lastly, we introduce and test an ‘RNA avoidance hypothesis’ that shows the influence of crosstalk RNA interactions on protein expression in bacteria.

    Small RNA targets : advances in prediction tools and high-throughput profiling

    MicroRNAs (miRNAs) are an abundant class of small non-coding RNAs that regulate gene expression at the post-transcriptional level. They are suggested to be involved in most biological processes of the cell, primarily by targeting messenger RNAs (mRNAs) for cleavage or translational repression. Their binding to their target sites is mediated by the Argonaute (AGO) family of proteins. Thus, miRNA target prediction is pivotal for research and clinical applications. Moreover, transfer-RNA-derived fragments (tRFs) and other types of small RNAs have been found to be potent regulators of AGO-mediated gene expression. Their role in mRNA regulation is still to be fully elucidated, and the computational prediction of their targets is in its infancy. To shed light on these complex RNA–RNA interactions, the availability of good-quality high-throughput data and reliable computational methods is of utmost importance. Even though the arsenal of computational approaches in the field has been enriched in the last decade, there is still a degree of discrepancy between the results they yield. This review offers an overview of the relevant advancements in the field of bioinformatics and machine learning and summarizes the key strategies utilized for small RNA target prediction. Furthermore, we report the recent development of high-throughput sequencing technologies, and explore the role of non-miRNA AGO driver sequences.

    A global RNA analysis of Neisseria gonorrhoeae in vitro and during human infection

    Thesis (Ph.D.)--Boston University. The mucosal disease gonorrhea, caused by the Gram-negative pathogen Neisseria gonorrhoeae, is estimated to cause at least 700,000 cases annually in the United States and 62 million cases worldwide. A strict human pathogen, N. gonorrhoeae infects several mucosal sites throughout the body, making proper gene regulation crucial. The goal of these studies was to define the global transcriptional response of N. gonorrhoeae during infection by analyzing its transcriptome during in vitro growth, during incubation with human epithelial cells, and during in vivo mucosal infection of the human female genital tract. Using RNA sequencing, we identified several new small RNA transcripts expressed in vitro that have the potential to regulate target mRNAs. Our studies were aided by the development of a novel computer program, Rockhopper, designed specifically for analysis of prokaryotic transcriptomes. Secondary methods were used to corroborate a strong correlation between Rockhopper analysis and N. gonorrhoeae transcriptional start sites, operon structures and gene expression levels. We also utilized Rockhopper to analyze the gonococcal transcriptome expressed during incubation with a human endocervical cell line. During such incubation, N. gonorrhoeae was demonstrated to regulate a large number of stress response and respiratory genes. Corresponding analysis of host cells during incubation with N. gonorrhoeae revealed increased expression of host pathways involved in innate immunity, adaptive immunity, cancer and apoptosis. Finally, gonococcal RNA from four vaginal lavage samples of female patients exposed to partners infected with N. gonorrhoeae was analyzed. This analysis demonstrated a profile of gonococcal stress response genes similar to that seen during incubation with epithelial cells. In addition, several novel sRNAs expressed by the gonococcus only during in vivo infection were identified. 
Analysis of the same vaginal lavage samples demonstrated that a number of human genes involved in immune pathways and cancer are expressed during mucosal gonococcal infection. These studies are the first to analyze gene regulation in N. gonorrhoeae globally during infection and greatly expand our knowledge of how the host and pathogen respond to infection. Furthermore, they have the potential to aid in the development of novel antibacterial therapeutics or new vaccine targets for this disease.

    Genome-wide transcription start site mapping of Bradyrhizobium japonicum grown free-living or in symbiosis – a rich resource to identify new transcripts, proteins and to study gene regulation

    Background: Differential RNA-sequencing (dRNA-seq) is indispensable for determination of primary transcriptomes. However, using dRNA-seq data to map transcriptional start sites (TSSs) and promoters genome-wide is a bioinformatics challenge. We performed dRNA-seq of Bradyrhizobium japonicum USDA 110, the nitrogen-fixing symbiont of soybean, and developed algorithms to map TSSs and promoters. Results: A specialized machine learning procedure for TSS recognition allowed us to map 15,923 TSSs: 14,360 in free-living bacteria, 4329 in symbiosis with soybean and 2766 in both conditions. Further, we provide proteomic evidence for 4090 proteins, among them 107 proteins corresponding to new genes and 178 proteins with N-termini different from the existing annotation (72 and 109 of them with TSS support, respectively). Guided by proteomics evidence, previously identified TSSs and TSSs experimentally validated here, we assign a score threshold to flag 14 % of the mapped TSSs as a class of lower confidence. However, this class of lower confidence contains valid TSSs of low-abundance transcripts. Moreover, we developed a de novo algorithm to identify promoter motifs upstream of mapped TSSs, which is publicly available, and found motifs mainly used in symbiosis (similar to RpoN-dependent promoters) or under both conditions (similar to RpoD-dependent promoters). Mapped TSSs and putative promoters, proteomic evidence and updated gene annotation were combined into an annotation file. Conclusions: The genome-wide TSS and promoter maps along with the extended genome annotation of B. japonicum represent a valuable resource for future systems biology studies and for detailed analyses of individual non-coding transcripts and ORFs. Our data will also provide new insights into bacterial gene regulation during the agriculturally important symbiosis between rhizobia and legumes.
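The TSS counts reported in the abstract above are mutually consistent if the 2766 TSSs found "in both conditions" are read as the overlap of the two condition-specific sets. The short check below works through that arithmetic; the variable names are illustrative, and the condition-exclusive counts and the approximate size of the lower-confidence class are derived here, not stated in the abstract.

```python
# Inclusion-exclusion check on the TSS counts quoted in the abstract.
free_living = 14_360  # TSSs detected in free-living bacteria
symbiosis = 4_329     # TSSs detected in symbiosis with soybean
both = 2_766          # TSSs detected under both conditions (overlap)

# Union of the two sets: |A| + |B| - |A and B|
total = free_living + symbiosis - both

# Derived values (not given in the abstract): condition-exclusive TSSs
only_free_living = free_living - both
only_symbiosis = symbiosis - both

# About 14 % of mapped TSSs are flagged as lower confidence;
# the rounded count is an estimate from the quoted percentage.
approx_low_confidence = round(0.14 * total)

print(total, only_free_living, only_symbiosis, approx_low_confidence)
```

The union equals the 15,923 mapped TSSs stated in the abstract, which supports reading the three counts as two overlapping sets and their intersection.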

    Regulatory role of small RNAs and RNA-binding proteins in carbon metabolism and collective behaviour of Vibrio cholerae

    The importance of small regulatory RNAs (sRNAs) has been recognized across all domains of life. Although sRNAs were originally considered “non-coding RNAs,” several bacterial sRNAs have been found to encode functional proteins that are under 50 amino acids long. This group of regulators is called dual-function regulators. To date, only five such regulators have been characterized in bacteria. In the primary study, the first dual-function RNA of Vibrio cholerae was discovered and characterized. The pathogen colonizes and infects the upper intestines by producing two key virulence determinants: toxin co-regulated pilus (TCP) and cholera toxin (CT). While all the known sRNAs of V. cholerae act directly or indirectly to regulate the production of TCP, the sRNA VqmR is the only known direct repressor of CT production to date. Therefore, a forward genetic screen was employed to score for CT repression. This screen identified another promising candidate called Vcr082. Interestingly, Vcr082 also encodes a 29-amino-acid ORF and hence was renamed VcdRP, for V. cholerae dual RNA regulator and protein, after its two roles. The dual regulator is controlled by the global transcription factor of carbon utilization, cAMP-CRP. The riboregulatory component is conserved at the 3’ end of the dual regulator. By employing a conserved stretch of four cytosines, VcdR base-pairs with and represses mRNAs that encode transporters that import PTS sugars. Additionally, VcdR downregulates the phosphocarrier proteins PtsH and PtsI, which are involved in the phospho-relay during glycolysis. The small protein VcdP exerts its regulatory role by interacting with and accelerating the activity of the enzyme citrate synthase, opening the gateway into the TCA cycle. In this way, both VcdR and VcdP act to block sugar uptake and modulate the flux through the TCA cycle, thereby striking a balance to maintain overall carbon metabolism in V. cholerae. 
The diverse environments that V. cholerae inhabits require the organism to rapidly perceive changes in its external environment and tailor its gene expression appropriately. To achieve this, the bacteria employ quorum sensing (QS) to communicate and coordinate a suitable response. While this mechanism of census taking was documented early on in several marine bacteria, more recent studies have identified additional QS systems in V. cholerae. Similarly, while biofilm formation has been extensively studied, the transition into biofilms and subsequent dispersal were only documented recently. This incomplete understanding prompted further investigation of the QS pathway. Therefore, in the second study, a forward genetic screen in a V. cholerae mutant library was employed to score for an altered QS phenotypic transition. This screen identified a novel RNA-binding protein called MbrA (membrane-bound RNA-binding protein A). This protein localizes to the membrane and contains two transmembrane domains at the N-terminus and a conserved RNA recognition motif-type RNA-binding domain located towards the C-terminus. MbrA is activated by the global transcription factor cAMP-CRP, and a subsequent transcriptome analysis revealed its role in the regulation of motility genes and the flagellar assembly complex in V. cholerae.

    Advanced molecular biology and transcriptomics techniques applied to actinomycetes

    Antibiotic resistance is becoming one of the biggest concerns of this century. It puts public health at risk and may, in the future, become one of the deadliest health issues. Therefore, a solution is needed. Streptomyces is known as a genus that is responsible for many antibiotics. However, in recent years the number of antibiotics reaching clinical use has diminished, for various reasons. Secondary metabolism comprises a number of processes that, although not essential for cell survival, can give the cell several advantages. Most of those advantages come in the form of secondary metabolites, some of which are antibiotics. On the other hand, phosphorus is one of the most important elements for any organism, and a mechanism is therefore needed to ensure that this element is regulated. One of the systems that does so depends on PhoR-PhoP. With the discovery of a new type of RNA, the sRNAs (which are between 50 and 400 nucleotides long), the investigation of new compounds of pharmaceutical and industrial importance may continue to advance, since some of those molecules may function as regulators of secondary metabolism and, therefore, be related to antibiotics. The aim of this work is to identify sRNAs that are implicated in the regulation of phosphate, which is the main assimilation form of phosphorus. My objective in this study was to experience what it is like to work in a laboratory by learning some of the molecular biology and transcriptomic techniques needed for this study, such as DNA introduction into cells and nucleic acid extraction. Master's degree in Molecular and Cell Biology.

    nocoRNAc: Characterization of non-coding RNAs in prokaryotes

    Background: The interest in non-coding RNAs (ncRNAs) constantly rose during the past few years because of the wide spectrum of biological processes in which they are involved. This led to the discovery of numerous ncRNA genes across many species. However, for most organisms the non-coding transcriptome still remains unexplored to a great extent. Various experimental techniques for the identification of ncRNA transcripts are available, but as these methods are costly and time-consuming, there is a need for computational methods that allow the detection of functional RNAs in complete genomes in order to suggest elements for further experiments. Several programs for the genome-wide prediction of functional RNAs have been developed but most of them predict a genomic locus with no indication whether the element is transcribed or not. Results: We present nocoRNAc, a program for the genome-wide prediction of ncRNA transcripts in bacteria. nocoRNAc incorporates various procedures for the detection of transcriptional features which are then integrated with functional ncRNA loci to determine the transcript coordinates. We applied RNAz and nocoRNAc to the genome of Streptomyces coelicolor and detected more than 800 putative ncRNA transcripts most of them located antisense to protein-coding regions. Using a custom design microarray we profiled the expression of about 400 of these elements and found more than 300 to be transcribed, 38 of them are predicted novel ncRNA genes in intergenic regions. The expression patterns of many ncRNAs are similarly complex as those of the protein-coding genes, in particular many antisense ncRNAs show a high expression correlation with their protein-coding partner. Conclusions: We have developed nocoRNAc, a framework that facilitates the automated characterization of functional ncRNAs. 
nocoRNAc increases the confidence of predicted ncRNA loci, especially if they contain transcribed ncRNAs. nocoRNAc is not restricted to intergenic regions, but it is applicable to the prediction of ncRNA transcripts in whole microbial genomes. The software as well as a user guide and example data is available at http://www.zbit.uni-tuebingen.de/pas/nocornac.htm.

    Small Regulatory RNA and Legionella pneumophila

    Legionella pneumophila is a Gram-negative bacterial species that is ubiquitous in almost any aqueous environment. It is the agent of Legionnaires’ disease, an acute and often under-reported form of pneumonia. In mammals, L. pneumophila replicates inside macrophages within a modified vacuole. Many protein regulators have been identified that control virulence-related properties, including RpoS, LetA/LetS, and PmrA/PmrB. In the past few years, the importance of regulation of virulence factors by small regulatory RNAs (sRNAs) has been increasingly appreciated. This is also the case in L. pneumophila, where three sRNAs (RsmY, RsmZ, and 6S RNA) were recently shown to be important determinants of virulence regulation and 79 actively transcribed sRNAs were identified. In this review we describe current knowledge about sRNAs and their regulatory properties and how this relates to the known regulatory systems of L. pneumophila. We also provide a model for sRNA-mediated control of gene expression that serves as a framework for understanding the regulation of virulence-related properties of L. pneumophila.