235 research outputs found

    Information content based model for the topological properties of the gene regulatory network of Escherichia coli

    Gene regulatory networks (GRN) are being studied with increasingly precise quantitative tools and can provide a testing ground for ideas regarding the emergence and evolution of complex biological networks. We analyze the global statistical properties of the transcriptional regulatory network of the prokaryote Escherichia coli, identifying each operon with a node of the network. We propose a null model for this network using the content-based approach applied earlier to the eukaryote Saccharomyces cerevisiae (Balcan et al., 2007). Random sequences that represent promoter regions and binding sequences are associated with the nodes. The length distributions of these sequences are extracted from the relevant databases. The network is constructed by testing for the occurrence of binding sequences within the promoter regions. The ensemble of emergent networks yields an exponentially decaying in-degree distribution and a putative power law dependence for the out-degree distribution with a flat tail, in agreement with the data. The clustering coefficient, degree-degree correlation, rich club coefficient and k-core visualization all agree qualitatively with the empirical network to an extent not yet achieved by any other computational model, to our knowledge. The significant statistical differences can point the way to further research into non-adaptive and adaptive processes in the evolution of the E. coli GRN. Comment: 58 pages, 3 tables, 22 figures. In press, Journal of Theoretical Biology (2009)
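
The content-based construction described in this abstract can be illustrated with a minimal sketch. It assumes a four-letter alphabet and exact substring matching, and it uses made-up sequence lengths rather than the length distributions the authors extract from databases; the function names are hypothetical and not taken from the paper.

```python
import random

def random_sequence(length, rng, alphabet="ACGT"):
    """Draw a random sequence of the given length (alphabet is an assumption)."""
    return "".join(rng.choice(alphabet) for _ in range(length))

def build_content_based_network(promoters, binding_seqs):
    """Return directed edges (i, j): node i's binding sequence occurs in node j's promoter."""
    edges = set()
    for i, binding in binding_seqs.items():
        for j, promoter in promoters.items():
            if binding in promoter:
                edges.add((i, j))
    return edges

def degree_counts(edges, nodes):
    """Count in-degree and out-degree of every node."""
    in_deg = {n: 0 for n in nodes}
    out_deg = {n: 0 for n in nodes}
    for i, j in edges:
        out_deg[i] += 1
        in_deg[j] += 1
    return in_deg, out_deg

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy ensemble member is reproducible
    nodes = range(50)
    # Illustrative lengths only; the paper samples lengths from database distributions.
    promoters = {n: random_sequence(60, rng) for n in nodes}
    # Assumption: only a subset of nodes act as regulators and carry a binding sequence.
    binding_seqs = {n: random_sequence(rng.randint(3, 6), rng) for n in range(10)}
    edges = build_content_based_network(promoters, binding_seqs)
    in_deg, out_deg = degree_counts(edges, nodes)
    print(len(edges), max(out_deg.values()))
```

Repeating the construction over many random draws gives an ensemble whose degree distributions could then be compared with the empirical network.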

    RegulonDB (version 5.0): Escherichia coli K-12 transcriptional regulatory network, operon organization, and growth conditions

    RegulonDB is the internationally recognized reference database of Escherichia coli K-12 offering curated knowledge of the regulatory network and operon organization. It is currently the largest electronically-encoded database of the regulatory network of any free-living organism. We present here the recently launched RegulonDB version 5.0 radically different in content, interface design and capabilities. Continuous curation of original scientific literature provides the evidence behind every single object and feature. This knowledge is complemented with comprehensive computational predictions across the complete genome. Literature-based and predicted data are clearly distinguished in the database. Starting with this version, RegulonDB public releases are synchronized with those of EcoCyc since our curation supports both databases. The complex biology of regulation is simplified in a navigation scheme based on three major streams: genes, operons and regulons. Regulatory knowledge is directly available in every navigation step. Displays combine graphic and textual information and are organized allowing different levels of detail and biological context. This knowledge is the backbone of an integrated system for the graphic display of the network, graphic and tabular microarray comparisons with curated and predicted objects, as well as predictions across bacterial genomes, and predicted networks of functionally related gene products. Access RegulonDB at

    CoryneRegNet 6.0—Updated database content, new analysis methods and novel features focusing on community demands

    Post-genomic analysis techniques such as next-generation sequencing have produced vast amounts of data about microorganisms, including genetic sequences, their functional annotations and gene regulatory interactions. The latter are genetic mechanisms that control a cell's characteristics, for instance, pathogenicity as well as survival and reproduction strategies. CoryneRegNet is the reference database and analysis platform for corynebacterial gene regulatory networks. In this article we introduce the updated version 6.0 of CoryneRegNet and describe the updated database content, which includes 6352 corynebacterial regulatory interactions compared with 4928 interactions in release 5.0 and 3235 regulations in release 4.0, respectively. We also demonstrate how we support the community by integrating analysis and visualization features for transiently imported custom data, such as gene regulatory interactions. Furthermore, with release 6.0, we provide easy-to-use functions that allow the user to submit data for persistent storage with the CoryneRegNet database. Thus, it offers important options to its users in terms of community demands. CoryneRegNet is publicly available at http://www.coryneregnet.de

    Co-Regulation of Metabolic Genes Is Better Explained by Flux Coupling Than by Network Distance

    To what extent can modes of gene regulation be explained by systems-level properties of metabolic networks? Prior studies on co-regulation of metabolic genes have mainly focused on graph-theoretical features of metabolic networks and demonstrated a decreasing level of co-expression with increasing network distance, a naïve, but widely used, topological index. Others have suggested that static graph representations can poorly capture dynamic functional associations, e.g., in the form of dependence of metabolic fluxes across genes in the network. Here, we systematically tested the relative importance of metabolic flux coupling and network position on gene co-regulation, using a genome-scale metabolic model of Escherichia coli. After validating the computational method with empirical data on flux correlations, we confirm that genes coupled by their enzymatic fluxes not only show similar expression patterns, but also share transcriptional regulators and frequently reside in the same operon. In contrast, we demonstrate that network distance per se has relatively minor influence on gene co-regulation. Moreover, the type of flux coupling can explain refined properties of the regulatory network that are ignored by simple graph-theoretical indices. Our results underline the importance of studying functional states of cellular networks to define physiologically relevant associations between genes and should stimulate future developments of novel functional genomic tools

    Low Degree Metabolites Explain Essential Reactions and Enhance Modularity in Biological Networks

    Recently there has been a lot of interest in identifying modules at the level of genetic and metabolic networks of organisms, as well as in identifying single genes and reactions that are essential for the organism. A goal of computational and systems biology is to go beyond identification towards an explanation of specific modules and essential genes and reactions in terms of specific structural or evolutionary constraints. In the metabolic networks of E. coli, S. cerevisiae and S. aureus, we identified metabolites with a low degree of connectivity, particularly those that are produced and/or consumed in just a single reaction. Using FBA we also determined reactions essential for growth in these metabolic networks. We find that most reactions identified as essential in these networks turn out to be those involving the production or consumption of low degree metabolites. Applying graph theoretic methods to these metabolic networks, we identified connected clusters of these low degree metabolites. The genes involved in several operons in E. coli are correctly predicted as those of enzymes catalyzing the reactions of these clusters. We independently identified clusters of reactions whose fluxes are perfectly correlated. We find that the composition of the latter 'functional clusters' is also largely explained in terms of clusters of low degree metabolites in each of these organisms. Our findings mean that most metabolic reactions that are essential can be tagged by one or more low degree metabolites. Those reactions are essential because they are the only ways of producing or consuming their respective tagged metabolites. Furthermore, reactions whose fluxes are strongly correlated can be thought of as 'glued together' by these low degree metabolites. Comment: 12 pages main text with 2 figures and 2 tables. 16 pages of Supplementary material. Revised version has title changed and contains study of 3 organisms instead of 1 earlie
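
The idea of tagging reactions by metabolites that are produced or consumed in only one reaction can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' pipeline: reactions are treated as irreversible, the toy network and the function name are hypothetical, and no flux balance analysis is performed.

```python
from collections import defaultdict

def tag_reactions_by_unique_metabolites(reactions):
    """Map each reaction to the metabolites it alone produces or alone consumes.

    `reactions` maps a reaction name to a (substrates, products) pair of sets.
    Reactions are assumed irreversible for simplicity.
    """
    producers = defaultdict(set)
    consumers = defaultdict(set)
    for name, (substrates, products) in reactions.items():
        for metabolite in substrates:
            consumers[metabolite].add(name)
        for metabolite in products:
            producers[metabolite].add(name)

    tagged = defaultdict(set)
    for role in (producers, consumers):
        for metabolite, reaction_names in role.items():
            if len(reaction_names) == 1:
                # Only one reaction makes (or uses) this metabolite.
                tagged[next(iter(reaction_names))].add(metabolite)
    return dict(tagged)

if __name__ == "__main__":
    toy = {
        "R1": ({"A"}, {"B"}),
        "R2": ({"A"}, {"C"}),
        "R3": ({"B", "C"}, {"D"}),
        "R4": ({"A"}, {"D"}),
    }
    print(tag_reactions_by_unique_metabolites(toy))
```

In the toy network, R4 is left untagged because D has a second producer and A has other consumers, whereas R1, R2 and R3 are the only routes for B and C.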

    BClass: A Bayesian Approach Based on Mixture Models for Clustering and Classification of Heterogeneous Biological Data

    Based on mixture models, we present a Bayesian method (called BClass) to classify biological entities (e.g. genes) when variables of quite heterogeneous nature are analyzed. Various statistical distributions are used to model the continuous/categorical data commonly produced by genetic experiments and large-scale genomic projects. We calculate the posterior probability of each entry to belong to each element (group) in the mixture. In this way, an original set of heterogeneous variables is transformed into a set of purely homogeneous characteristics represented by the probabilities of each entry to belong to the groups. The number of groups in the analysis is controlled dynamically by rendering the groups as 'alive' and 'dormant' depending upon the number of entities classified within them. Using standard Metropolis-Hastings and Gibbs sampling algorithms, we constructed a sampler to approximate posterior moments and grouping probabilities. Since this method does not require the definition of similarity measures, it is especially suitable for data mining and knowledge discovery in biological databases. We applied BClass to classify genes in RegulonDB, a database specialized in information about the transcriptional regulation of gene expression in the bacterium Escherichia coli. The classification obtained is consistent with current knowledge and allowed prediction of missing values for a number of genes. BClass is object-oriented and fully programmed in Lisp-Stat. The output grouping probabilities are analyzed and interpreted using graphical (dynamically linked plots) and query-based approaches. We discuss the advantages of using Lisp-Stat as a programming language as well as the problems we faced when the data volume increased exponentially due to the ever-growing number of genomic projects.

    Characterization of relationships between transcriptional units and operon structures in Bacillus subtilis and Escherichia coli

    BACKGROUND: Operon structures play an important role in transcriptional regulation in prokaryotes. However, there have been fewer studies on complicated operon structures in which the transcriptional units vary with changing environmental conditions. Information about such complicated operons is helpful for predicting and analyzing operon structures, as well as understanding gene functions and transcriptional regulation. RESULTS: We systematically analyzed the experimentally verified transcriptional units (TUs) in Bacillus subtilis and Escherichia coli obtained from ODB and RegulonDB. To understand the relationships between TUs and operons, we defined a new classification system for adjacent gene pairs, divided into three groups according to the level of gene co-regulation: operon pairs (OP) that belong to the same TU, sub-operon pairs (SOP) that are at the transcriptional boundaries within an operon, and non-operon pairs (NOP) that belong to different operons. Consequently, we found that the levels of gene co-regulation were correlated with intergenic distances and gene expression levels. Additional analysis revealed that they were also correlated with the levels of conservation across about 200 prokaryotic genomes. Most interestingly, we found that functional associations in SOPs were more frequently observed in the environmental and genetic information processes. CONCLUSION: Complicated operon structures were correlated with genome organization and gene expression profiles. Such intricately regulated operons allow functional differences depending on environmental conditions. These regulatory mechanisms are helpful in accommodating the variety of changes that happen around the cell. In addition, such differences may play an important role in the evolution of gene order across genomes
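
One possible reading of the OP / SOP / NOP classification is sketched below. The abstract does not give the exact decision rule, so the rule used here is an assumption: a pair in different operons is NOP, a pair in the same operon is SOP if any transcriptional unit contains exactly one of the two genes, and OP otherwise. The gene, operon and TU names are hypothetical.

```python
def classify_pair(gene_a, gene_b, operon_of, transcription_units):
    """Classify an adjacent gene pair as 'OP', 'SOP' or 'NOP'.

    operon_of: dict mapping gene -> operon identifier.
    transcription_units: iterable of sets of genes, one set per TU.
    The decision rule is an assumed interpretation of the abstract.
    """
    operon_a = operon_of.get(gene_a)
    operon_b = operon_of.get(gene_b)
    if operon_a is None or operon_a != operon_b:
        return "NOP"  # the genes belong to different operons
    for tu in transcription_units:
        if (gene_a in tu) != (gene_b in tu):
            return "SOP"  # a TU boundary falls between the two genes
    return "OP"  # every TU contains both genes or neither

if __name__ == "__main__":
    operon_of = {"a": "op1", "b": "op1", "c": "op1", "d": "op2"}
    tus = [{"a", "b", "c"}, {"b", "c"}, {"d"}]
    for pair in [("a", "b"), ("b", "c"), ("c", "d")]:
        print(pair, classify_pair(*pair, operon_of, tus))
```

Here the internal TU {b, c} puts a boundary between a and b, so that pair is SOP, while b and c are always co-transcribed and form an OP.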

    Self-organization of gene regulatory network motifs enriched with short transcript's half-life transcription factors

    Network motifs, the recurring regulatory structural patterns in networks, are able to self-organize to produce networks. Three major motifs, feedforward loop, single input modules and bi-fan are found in gene regulatory networks. The large ratio of genes to transcription factors (TFs) in genomes leads to a sharing of TFs by motifs and is sufficient to result in network self-organization. We find a common design principle of these motifs: short transcript's half-life (THL) TFs are significantly enriched in motifs and hubs. This enrichment becomes one of the driving forces for the emergence of the network scale-free topology and allows the network to quickly adapt to environmental changes. Most feedforward loops and bi-fans contain at least one short THL TF, which can be seen as a criterion for self-assembling these motifs. We have classified the motifs according to their short THL TF content. We show that the percentage of the different motif subtypes varies in different cellular conditions. Comment: Trends Genet (in press), main text 1, supplementary notes 1, 40 pages, 7 tables, 4 figs, minor modification

    Operon Prediction with Bayesian Classifiers

    In this work, we present an approach to predicting transcription units based on Bayesian classifiers. The predictor uses publicly available data to train the classifier, such as genome sequence data from Genbank, expression values from microarray experiments, and a collection of experimentally verified transcription units. We have studied the importance of each data source for the performance of the predictor by developing three classifier models and evaluating their outcomes. The predictor was trained and validated on the E. coli genome, but can be extended to other organisms. Using the full Bayesian classifier, we were able to correctly identify 80% of gene pairs belonging to operons

    Methods for analysis of derivative strains from metabolic evolution experiments

    One of the largest challenges in genomics studies is determining the relationship between genotype and phenotype and then applying this knowledge to design principles. Metabolic engineering of bacteria can introduce targeted genomic interventions to well-characterized genes for the purpose of modifying cellular metabolism, but in some cases, even for the model organism Escherichia coli, alternative strategies are required to achieve a desired phenotype. Metabolic evolution involves applying selective pressure to a population, and over time advantageous mutations will arise that improve organism fitness. To understand what mutations occurred during these experiments and how they affect phenotype, whole genome sequencing is required, followed by mutation analysis and strain characterization. Genome sequencing generates a large amount of data for researchers to examine and traditionally mutation analysis focuses only on gene variations. Supporting mutation analysis with computational tools and using a systems-level approach that utilizes public databases describing gene regulation and cellular metabolism improves upon existing analysis techniques and advances our understanding of how genotype relates to phenotype. Using our mutation analysis software, E. coli Variant Analysis (EVA), we examine antibiotic resistance, benzoate tolerance, and octanoic acid tolerance in E. coli. Our analysis pipeline includes a defined set of rules for mutation categorization. Prioritization of mutations supports efforts to reverse-engineer evolved strains and focus on the variants most likely to be damaging or relevant to phenotype. From mutation analysis results, we construct biological networks for visualization of mutations and possible downstream effects. This allows for improved mutation interpretation and identification of possible mutation interactions. 
    Furthermore, we integrate RNA-seq data into our analysis to investigate the effects of variant regulators on the transcriptome. In contrast to existing methods which focus on mutated genes, we incorporate annotations for binding sites and other regulatory features on the genome for the most complete interpretation based on the available genome and gene regulatory models