70 research outputs found

    Conservation of transcriptional sensing systems in prokaryotes: A perspective from Escherichia coli

    The activity of transcription factors is usually governed by allosteric physicochemical signals or metabolites, which are in turn produced in the cell or obtained from the environment by the activity of the products of effector genes. Previously, we identified a collection of more than 110 transcription factors and their corresponding effector genes in Escherichia coli K-12. Here, we introduce the notion of “triferog”, which relates to the identification of orthologous transcription factors and effector genes across genomes, and show that transcriptional sensing systems known in E. coli are poorly conserved beyond Salmonella. We also find that enzymes that act as effector genes for the production of endogenous effector metabolites are more conserved than the corresponding effector genes encoding transport and two-component systems for sensing exogenous signals. Finally, we observe that on an evolutionary scale enzymes are more conserved than their respective TFs, suggesting a homogeneous cellular metabolism across genomes and the conservation of transcriptional control of critical cellular processes like DNA replication by a common endogenous signal. We hypothesize that extensive variation in the domain architecture of TFs and changes in endogenous conditions at large phylogenetic distances could be the major contributing factors for the observed differential conservation of TFs and their corresponding effector genes encoding enzymes, causing variations in transcriptional responses across organisms.

    Information on Transcriptional Regulation and Signal Transduction of Escherichia coli K-12 Integrated in the Database RegulonDB.

    Since its inception, RegulonDB (http://regulondb.ccg.unam.mx/) has been a database that compiles information about the regulation of transcription initiation of Escherichia coli K-12. However, we are aware that transcriptional regulation is not an isolated process; instead, it is the response to the different environmental conditions that trigger a series of concatenated reactions that end in transcriptional regulation, and it implies an adequate response in terms of induced and repressed gene products. We are working now to include all these new data in RegulonDB. As a consequence, transcriptional regulation in RegulonDB will be part of a unit that initiates with the signal, continues with the signal transduction to the core of regulation to modify expression of the affected set of target genes, and ends with an adequate response. We refer to these units as genetic sensory response units, or Gensor Units.

The inclusion of Gensor Units will bring a dramatic change and expansion of RegulonDB, due to the fact that we will be adding several new types of reactions and interactions. We started to collect data about signal transduction of the sigma factors, the two-component systems, of some transcription factors involved in carbon source utilization, and of genes involved in the synthesis of amino acids. We plan a high-level curation with super-pathways summarizing concatenated sets of reactions linked to those other databases that curate such information, while enabling with RegulonDB a compilation of complete Gensor Units.

In addition, the number of DNA binding sites for some transcription factors has grown considerably, and therefore we decided to review systematically those sites whose lengths range from 40 to 60 bp, with orientation and consensus sequences that are not easy to identify. The current version of RegulonDB is the beginning of a higher-level curation of gene regulation information, and eventually our database will include all regulatory mechanisms and their regulated genes.

    Information on Transcriptional Regulation and Signal Transduction of Escherichia coli K-12 Integrated in the Database RegulonDB.

    Since its inception, RegulonDB (http://regulondb.ccg.unam.mx/) has been a database that compiles information about the regulation of transcription initiation of Escherichia coli K-12. However, we are aware that transcriptional regulation is not an isolated process; instead, it is the response to the different environmental conditions that trigger a series of concatenated reactions that end in transcriptional regulation, and it implies an adequate response in terms of induced and repressed gene products. We are working now to include all these new data in RegulonDB. As a consequence, transcriptional regulation in RegulonDB will be part of a unit that initiates with the signal, continues with the signal transduction to the core of regulation to modify expression of the affected set of target genes, and ends with an adequate response. We refer to these units as genetic sensory response units, or geSorgans.

The inclusion of geSorgans will bring a dramatic change and expansion of RegulonDB, due to the fact that we will be adding several new types of reactions and interactions. We started to collect data about signal transduction of the sigma factors, the two-component systems, of some transcription factors involved in carbon source utilization, and of genes involved in the synthesis of amino acids. We plan a high-level curation with super-pathways summarizing concatenated sets of reactions linked to those other databases that curate such information, while enabling with RegulonDB a compilation of complete geSorgans.

In addition, the number of DNA binding sites for some transcription factors has grown considerably, and therefore we decided to review systematically those sites whose lengths range from 40 to 60 bp, with orientation and consensus sequences that are not easy to identify. The current version of RegulonDB is the beginning of a higher-level curation of gene regulation information, and eventually our database will include all regulatory mechanisms and their regulated genes.

    Theoretical and empirical quality assessment of transcription factor-binding motifs

    Position-specific scoring matrices (PSSMs) are routinely used to predict transcription factor (TF)-binding sites in genome sequences. However, their reliability in predicting novel binding sites can be far from optimal, due to the use of a small number of training sites or the inappropriate choice of parameters when building the matrix or when scanning sequences with it. Measures of matrix quality such as E-value and information content rely on theoretical models, and may fail in the context of full genome sequences. We propose a method, implemented in the program ‘matrix-quality’, that combines theoretical and empirical score distributions to assess the reliability of PSSMs for predicting TF-binding sites. We applied ‘matrix-quality’ to estimate the predictive capacity of matrices for bacterial, yeast and mouse TFs. The evaluation of matrices from RegulonDB revealed some poorly predictive motifs, and allowed us to quantify the improvements obtained by applying multi-genome motif discovery. Interestingly, the method reveals differences between global and specific regulators. It also highlights the enrichment of binding sites in sequence sets obtained from high-throughput ChIP-chip (bacterial and yeast TFs) and ChIP–seq experiments (mouse TFs). The method presented here has many applications, including: selecting reliable motifs before scanning sequences; improving motif collections in TF databases; evaluating motifs discovered using high-throughput data sets.
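
    As a rough illustration of what building a PSSM and scanning a sequence with it involves, the sketch below derives log-odds weights from a handful of aligned sites and scores every window of a short sequence. It is not the ‘matrix-quality’ program and does not reproduce its score distributions. The training sites, the uniform background, the pseudocount and the function names are all assumptions made for this example.

```python
import math

BASES = "ACGT"


def build_pssm(sites, background=None, pseudocount=1.0):
    """Build a log-odds PSSM (natural log) from equal-length aligned sites.

    Returns a list with one dict per position, mapping base -> weight.
    """
    if background is None:
        # Assumption: uniform base composition; a real genome would differ.
        background = {base: 0.25 for base in BASES}
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all training sites must have the same length")

    pssm = []
    for position in range(width):
        column = [site[position] for site in sites]
        weights = {}
        for base in BASES:
            # The pseudocount is shared out according to the background,
            # so unseen bases get a finite (negative) weight.
            frequency = (column.count(base) + pseudocount * background[base]) / (
                len(sites) + pseudocount
            )
            weights[base] = math.log(frequency / background[base])
        pssm.append(weights)
    return pssm


def scan(pssm, sequence):
    """Yield (offset, score) for each window of the sequence, forward strand only."""
    width = len(pssm)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        yield start, sum(column[base] for column, base in zip(pssm, window))


# Hypothetical training sites, invented for the example.
sites = ["TTGACA", "TTGACT", "TTTACA", "TTGATA"]
pssm = build_pssm(sites)
hits = list(scan(pssm, "GGCTTGACAGG"))
best_offset, best_score = max(hits, key=lambda hit: hit[1])
print(best_offset, round(best_score, 3))
```

    With only four training sites the weights depend heavily on the pseudocount, which is the kind of small-sample sensitivity the abstract warns about.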

    The comprehensive updated regulatory network of Escherichia coli K-12

    BACKGROUND: Escherichia coli is the model organism for which our knowledge of its regulatory network is the most extensive. Over the last few years, our project has been collecting and curating the literature concerning E. coli transcription initiation and operons, providing in both the RegulonDB and EcoCyc databases the largest electronically encoded network available. A paper published recently by Ma et al. (2004) showed several differences in the versions of the network present in these two databases. Discrepancies have been corrected, annotations from this and other groups (Shen-Orr et al., 2002) have been added, making the RegulonDB and EcoCyc databases the largest comprehensive and constantly curated regulatory network of E. coli K-12. RESULTS: Several groups have been using these curated data as part of their bioinformatics and systems biology projects, in combination with external data obtained from other sources, thus enlarging the dataset initially obtained from either RegulonDB or EcoCyc of the E. coli K12 regulatory network. We kindly obtained from the groups of Uri Alon and Hong-Wu Ma the interactions they have added to enrich their public versions of the E. coli regulatory network. These were used to search for original references and curate them with the same standards we use regularly, adding in several cases the original references (instead of reviews or missing references), as well as adding the corresponding experimental evidence codes. We also corrected all discrepancies in the two databases available as explained below. CONCLUSION: One hundred and fifty new interactions have been added to our databases as a result of this specific curation effort, in addition to those added as a result of our continuous curation work. RegulonDB gene names are now based on those of EcoCyc to avoid confusion due to gene names and synonyms, and the public releases of RegulonDB and EcoCyc are henceforth synchronized to avoid confusion due to different versions. 
Public flat files are available, providing direct access to the regulatory network interactions and thus avoiding errors due to differences in database modelling and representation. The regulatory network available in RegulonDB and EcoCyc is the most comprehensive and regularly updated electronically-encoded regulatory network of E. coli K-12.

    Immunity related genes in dipterans share common enrichment of AT-rich motifs in their 5' regulatory regions that are potentially involved in nucleosome formation

    Get PDF
    BACKGROUND: Understanding the transcriptional regulation mechanisms in response to environmental challenges is of fundamental importance in biology. Transcription factors associated with response elements and the chromatin structure have proven to play important roles in gene expression regulation. We have analyzed promoter regions of dipteran genes induced in response to immune challenge, in search of particular sequence patterns involved in their transcriptional regulation. RESULTS: 5' upstream regions of D. melanogaster and A. gambiae immunity-induced genes and their corresponding orthologous genes in 11 non-melanogaster drosophilid species and Ae. aegypti share enrichment in AT-rich short motifs. AT-rich motifs are associated with nucleosome formation as predicted by two different algorithms. In A. gambiae and D. melanogaster, the 5' upstream sequences of many immunity genes also showed NFκB response elements, located within 500 bp of the transcription start site. In A. gambiae, the frequency of the ATAA motif near the NFκB response elements was increased, suggesting a functional link between nucleosome formation/remodelling and NFκB regulation of transcription. CONCLUSION: AT-rich motif enrichment in 5' upstream sequences of immunity genes in A. gambiae, Ae. aegypti and the Drosophila genus suggests a particular pattern of nucleosome formation/chromatin organization. The co-occurrence of such motifs with the NFκB response elements suggests that these sequence signatures may be functionally involved in transcriptional activation during the dipteran immune response. AT-rich motif enrichment in regulatory regions in this group of co-regulated genes could represent an evolutionarily constrained signature in dipterans and perhaps other distantly related species.

    RegulonDB (version 5.0): Escherichia coli K-12 transcriptional regulatory network, operon organization, and growth conditions

    RegulonDB is the internationally recognized reference database of Escherichia coli K-12 offering curated knowledge of the regulatory network and operon organization. It is currently the largest electronically-encoded database of the regulatory network of any free-living organism. We present here the recently launched RegulonDB version 5.0 radically different in content, interface design and capabilities. Continuous curation of original scientific literature provides the evidence behind every single object and feature. This knowledge is complemented with comprehensive computational predictions across the complete genome. Literature-based and predicted data are clearly distinguished in the database. Starting with this version, RegulonDB public releases are synchronized with those of EcoCyc since our curation supports both databases. The complex biology of regulation is simplified in a navigation scheme based on three major streams: genes, operons and regulons. Regulatory knowledge is directly available in every navigation step. Displays combine graphic and textual information and are organized allowing different levels of detail and biological context. This knowledge is the backbone of an integrated system for the graphic display of the network, graphic and tabular microarray comparisons with curated and predicted objects, as well as predictions across bacterial genomes, and predicted networks of functionally related gene products. Access RegulonDB at

    Automatic reconstruction of a bacterial regulatory network using Natural Language Processing

    BACKGROUND: Manual curation of biological databases, an expensive and labor-intensive process, is essential for high quality integrated data. In this paper we report the implementation of a state-of-the-art Natural Language Processing system that creates computer-readable networks of regulatory interactions directly from different collections of abstracts and full-text papers. Our major aim is to understand how automatic annotation using Text-Mining techniques can complement manual curation of biological databases. We implemented a rule-based system to generate networks from different sets of documents dealing with regulation in Escherichia coli K-12. RESULTS: Performance evaluation is based on the most comprehensive transcriptional regulation database for any organism, the manually-curated RegulonDB, 45% of which we were able to recreate automatically. From our automated analysis we were also able to find some new interactions from papers not already curated, or that were missed in the manual filtering and review of the literature. We also put forward a novel Regulatory Interaction Markup Language better suited than SBML for simultaneously representing data of interest for biologists and text miners. CONCLUSION: Manual curation of the output of automatic processing of text is a good way to complement a more detailed review of the literature, either for validating the results of what has been already annotated, or for discovering facts and information that might have been overlooked at the triage or curation stages.

    Coordination logic of the sensing machinery in the transcriptional regulatory network of Escherichia coli

    The active and inactive state of transcription factors in growing cells is usually directed by allosteric physicochemical signals or metabolites, which are in turn either produced in the cell or obtained from the environment by the activity of the products of effector genes. To understand the regulatory dynamics and to improve our knowledge about how transcription factors (TFs) respond to endogenous and exogenous signals in the bacterial model, Escherichia coli, we previously proposed to classify TFs into external, internal and hybrid sensing classes depending on the source of their allosteric or equivalent metabolite. Here we analyze how a cell uses its topological structures in the context of sensing machinery and show that, while feed-forward loops (FFLs) tightly integrate internal and external sensing TFs, connecting TFs from different layers of the hierarchical transcriptional regulatory network (TRN), bifan motifs frequently connect TFs belonging to the same sensing class and could act as a bridge between TFs originating from the same level in the hierarchy. We observe that modules identified in the regulatory network of E. coli are heterogeneous in sensing context, with a clear combination of internal and external sensing categories depending on the physiological role played by the module. We also note that the propensity of two-component response regulators at promoters increases as the number of TFs regulating a target operon increases. Finally, we show that evolutionary families of TFs do not show a tendency to preserve their sensing abilities. Our results provide a detailed panorama of the topological structures of the E. coli TRN and of the way the TFs that compose them sense their surroundings by coordinating responses.
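
    To make the feed-forward loop motif concrete, the sketch below enumerates FFLs (X regulates Y, and both X and Y regulate Z) in a small directed network. It is only an illustration of the motif definition, not the analysis pipeline used in the paper. The node names and edges are a made-up toy network, and the function name is an assumption.

```python
def feed_forward_loops(edges):
    """Return sorted (x, y, z) triples where x->y, y->z and x->z all exist.

    `edges` is an iterable of (regulator, target) pairs. Self-loops
    (autoregulation) are ignored because they are not part of the motif.
    """
    targets = {}
    for regulator, target in edges:
        if regulator != target:
            targets.setdefault(regulator, set()).add(target)

    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            # y must itself be a regulator to sit in the middle of the loop.
            for z in targets.get(y, ()):
                if z != x and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


# Hypothetical toy network: A regulates B and C, B regulates C, C regulates D,
# and A is autoregulated.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("A", "A")]
print(feed_forward_loops(toy_edges))  # [('A', 'B', 'C')]
```

    If two TFs regulate each other and share a target, this definition reports the triple in both orders; whether to count that once or twice is a choice left to the analysis.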