    Computational detection of genomic cis-regulatory modules applied to body patterning in the early Drosophila embryo

    BACKGROUND: Regulation of gene transcription is crucial for the function and development of all organisms. While gene prediction programs that identify protein coding sequence are used with remarkable success in the annotation of genomes, the development of computational methods to analyze noncoding regions and to delineate transcriptional control elements is still in its infancy. RESULTS: Here we present novel algorithms to detect cis-regulatory modules through genome-wide scans for clusters of transcription factor binding sites using three levels of prior information. When binding sites for the factors are known, our statistical segmentation algorithm, Ahab, yields about 150 putative gap gene regulated modules, with no adjustable parameters other than a window size. If one or more related modules are known, but no binding sites, repeated motifs can be found by a customized Gibbs sampler and input to Ahab, to predict genes with similar regulation. Finally, using only the genome, we developed a third algorithm, Argos, that counts and scores clusters of overrepresented motifs in a window of sequence. Argos recovers many of the known modules, upstream of the segmentation genes, with no training data. CONCLUSIONS: We have demonstrated, in the case of body patterning in the Drosophila embryo, that our algorithms allow the genome-wide identification of regulatory modules. We believe that Ahab overcomes many problems of recent approaches and we estimated the false positive rate to be about 50%. Argos is the first successful attempt to predict regulatory modules using only the genome without training data. Complete results and module predictions across the Drosophila genome are available at http://uqbar.rockefeller.edu/~siggia/
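The general idea of scanning a sequence for windows that contain a cluster of binding sites can be illustrated with a minimal sketch. This is not Ahab or Argos, whose statistical models the abstract does not spell out; it assumes exact string matches to known motifs and a plain count threshold, and the function names, motif, window size and threshold are all hypothetical.

```python
from typing import List, Tuple


def find_site_positions(sequence: str, motifs: List[str]) -> List[int]:
    """Return the sorted start positions of exact matches to any motif."""
    positions = []
    for motif in motifs:
        start = sequence.find(motif)
        while start != -1:
            positions.append(start)
            # Step by one so overlapping matches are also found.
            start = sequence.find(motif, start + 1)
    return sorted(positions)


def clustered_windows(
    sequence: str, motifs: List[str], window: int, min_sites: int
) -> List[Tuple[int, int]]:
    """Slide a fixed-size window and report (window_start, site_count)
    for every window holding at least `min_sites` site start positions.

    Assumption of this sketch: a site counts if it *starts* inside the
    window, even if it extends past the window end.
    """
    positions = find_site_positions(sequence, motifs)
    hits = []
    last_start = max(0, len(sequence) - window)
    for w_start in range(last_start + 1):
        count = sum(1 for p in positions if w_start <= p < w_start + window)
        if count >= min_sites:
            hits.append((w_start, count))
    return hits
```

A real scanner would score matches with position weight matrices and compare counts against a background model rather than use a fixed threshold.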

    Conservation of regulatory elements between two species of Drosophila

    BACKGROUND: One of the important goals in the post-genomic era is to determine the regulatory elements within the non-coding DNA of a given organism's genome. The identification of functional cis-regulatory modules has proven difficult since the component factor binding sites are small and the rules governing their arrangement are poorly understood. However, the genomes of suitably diverged species help to predict regulatory elements based on the generally accepted assumption that conserved blocks of genomic sequence are likely to be functional. To judge the efficacy of strategies that prefilter by sequence conservation it is important to know to what extent the converse assumption holds, namely that functional elements common to both species will fall within these conserved blocks. The recently completed sequence of a second Drosophila species provides an opportunity to test this assumption for one of the experimentally best studied regulatory networks in multicellular organisms, the body patterning of the fly embryo. RESULTS: We find that 50%–70% of known binding sites reside in conserved sequence blocks, but these percentages are not greatly enriched over what is expected by chance. Finally, a computational genome-wide search in both species for regulatory modules based on clusters of binding sites suggests that genes central to the regulatory network are consistently recovered. CONCLUSIONS: Our results indicate that binding sites remain clustered for these "core modules" while not necessarily residing in conserved blocks. This is an important clue as to how regulatory information is encoded in the genome and how modules evolve
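The comparison between the share of binding sites that fall in conserved blocks and the share expected by chance can be sketched as follows. This is only an illustration under simple assumptions: conserved blocks are non-overlapping half-open intervals, sites are single positions, and the chance expectation is taken to be the fraction of the region covered by blocks. The function names and the coordinates in the usage lines are hypothetical, not data from the study.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # half-open interval [start, end)


def fraction_in_blocks(site_positions: List[int], blocks: List[Block]) -> float:
    """Fraction of sites whose position lies inside any conserved block."""
    if not site_positions:
        return 0.0
    inside = sum(
        1 for p in site_positions if any(start <= p < end for start, end in blocks)
    )
    return inside / len(site_positions)


def covered_fraction(blocks: List[Block], region_length: int) -> float:
    """Fraction of the region covered by blocks (assumed non-overlapping).

    Under uniform random placement of sites, this is the expected
    fraction of sites landing in conserved sequence.
    """
    return sum(end - start for start, end in blocks) / region_length


# Hypothetical toy coordinates, for illustration only.
blocks = [(0, 10), (20, 30)]
sites = [5, 15, 25, 35]
observed = fraction_in_blocks(sites, blocks)
expected = covered_fraction(blocks, region_length=40)
enrichment = observed / expected
```

An enrichment close to 1 corresponds to the situation the abstract describes, where sites are no more likely to sit in conserved blocks than chance would predict.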

    Challenges for modeling global gene regulatory networks during development: Insights from Drosophila

    Development is regulated by dynamic patterns of gene expression, which are orchestrated through the action of complex gene regulatory networks (GRNs). Substantial progress has been made in modeling transcriptional regulation in recent years, ranging from qualitative “coarse-grain” models operating at the gene level to very “fine-grain” quantitative models operating at the biophysical “transcription factor-DNA level”. Recent advances in genome-wide studies have revealed an enormous increase in the size and complexity of GRNs. Even relatively simple developmental processes can involve hundreds of regulatory molecules, with extensive interconnectivity and cooperative regulation. This leads to an explosion in the number of regulatory functions, effectively impeding Boolean-based qualitative modeling approaches. At the same time, the lack of information on the biophysical properties for the majority of transcription factors within a global network restricts quantitative approaches. In this review, we explore the current challenges in moving from modeling medium scale well-characterized networks to more poorly characterized global networks. We suggest integrating coarse- and fine-grain approaches to model gene regulatory networks in cis. We focus on two very well-studied examples from Drosophila, which likely represent typical developmental regulatory modules across metazoans

    Transcriptional Control in the Segmentation Gene Network of Drosophila

    The segmentation gene network of Drosophila consists of maternal and zygotic factors that generate, by transcriptional (cross-) regulation, expression patterns of increasing complexity along the anterior-posterior axis of the embryo. Using known binding site information for maternal and zygotic gap transcription factors, the computer algorithm Ahab recovers known segmentation control elements (modules) with excellent success and predicts many novel modules within the network and genome-wide. We show that novel module predictions are highly enriched in the network and typically clustered proximal to the promoter, not only upstream, but also in intronic space and downstream. When placed upstream of a reporter gene, they consistently drive patterned blastoderm expression, in most cases faithfully producing one or more pattern elements of the endogenous gene. Moreover, we demonstrate for the entire set of known and newly validated modules that Ahab's prediction of binding sites correlates well with the expression patterns produced by the modules, revealing basic rules governing their composition. Specifically, we show that maternal factors consistently act as activators and that gap factors act as repressors, except for the bimodal factor Hunchback. Our data suggest a simple context-dependent rule for its switch from repressive to activating function. Overall, the composition of modules appears well fitted to the spatiotemporal distribution of their positive and negative input factors. Finally, by comparing Ahab predictions with different categories of transcription factor input, we confirm the global regulatory structure of the segmentation gene network, but find odd skipped behaving like a primary pair-rule gene. The study expands our knowledge of the segmentation gene network by increasing the number of experimentally tested modules by 50%. 
    For the first time, the entire set of validated modules is analyzed for binding site composition under a uniform set of criteria, permitting the definition of basic composition rules. The study demonstrates that computational methods are a powerful complement to experimental approaches in the analysis of transcription networks

    Decoding transcription and microRNA-mediated translation control in Drosophila development

    The spatio-temporal regulation of gene expression lies at the heart of animal development. In this article we present an overview of our recent work to apply systems biological approaches to the study of transcription and microRNA-mediated translation control in Drosophila development. We have identified many new cis-regulatory elements within the segmentation gene network, a transcriptional hierarchy governing pattern formation along the antero-posterior axis of the embryo, and developed a novel thermodynamic model to predict their expression. A similar thermodynamic approach that takes into account the secondary structure of the target mRNA significantly improves the prediction of microRNA binding sites

    Quantitative Models of the Mechanisms That Control Genome-Wide Patterns of Transcription Factor Binding during Early Drosophila Development

    Transcription factors that drive complex patterns of gene expression during animal development bind to thousands of genomic regions, with quantitative differences in binding across bound regions mediating their activity. While we now have tools to characterize the DNA affinities of these proteins and to precisely measure their genome-wide distribution in vivo, our understanding of the forces that determine where, when, and to what extent they bind remains primitive. Here we use a thermodynamic model of transcription factor binding to evaluate the contribution of different biophysical forces to the binding of five regulators of early embryonic anterior-posterior patterning in Drosophila melanogaster. Predictions based on DNA sequence and in vitro protein-DNA affinities alone achieve a correlation of ∼0.4 with experimental measurements of in vivo binding. Incorporating cooperativity and competition among the five factors, and accounting for spatial patterning by modeling binding in every nucleus independently, had little effect on prediction accuracy. A major source of error was the prediction of binding events that do not occur in vivo, which we hypothesized reflected reduced accessibility of chromatin. To test this, we incorporated experimental measurements of genome-wide DNA accessibility into our model, effectively restricting predicted binding to regions of open chromatin. This dramatically improved our predictions to a correlation of 0.6–0.9 for various factors across known target genes. Finally, we used our model to quantify the roles of DNA sequence, accessibility, and binding competition and cooperativity. Our results show that, in regions of open chromatin, binding can be predicted almost exclusively by the sequence specificity of individual factors, with a minimal role for protein interactions. 
    We suggest that a combination of experimentally determined chromatin accessibility data and simple computational models of transcription factor binding may be used to predict the binding landscape of any animal transcription factor with significant precision
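How a thermodynamic binding prediction can be restricted by chromatin accessibility is sketched below. This is a deliberately simplified illustration, not the model used in the study: it assumes independent sites for a single factor (no cooperativity or competition), the standard single-site equilibrium occupancy K·c / (1 + K·c), and accessibility expressed as a multiplicative weight between 0 and 1. All names and units are hypothetical.

```python
from typing import List


def site_occupancy(concentration: float, affinity: float) -> float:
    """Equilibrium occupancy of one independent site.

    `affinity` is an association constant K and `concentration` is the
    free factor concentration c, in reciprocal units, so K * c is
    dimensionless. Occupancy = K*c / (1 + K*c).
    """
    weight = concentration * affinity
    return weight / (1.0 + weight)


def predicted_binding(
    concentration: float, affinities: List[float], accessibility: List[float]
) -> List[float]:
    """Sequence-based occupancy at each site, scaled by accessibility.

    Assumption of this sketch: accessibility[i] in [0, 1] multiplies the
    occupancy, so closed chromatin (0.0) suppresses predicted binding
    regardless of sequence affinity.
    """
    if len(affinities) != len(accessibility):
        raise ValueError("affinities and accessibility must have equal length")
    return [
        access * site_occupancy(concentration, k)
        for k, access in zip(affinities, accessibility)
    ]
```

In this form, a high-affinity site in closed chromatin receives a prediction of zero, which is the kind of false positive the abstract reports removing once accessibility data are included.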

    A systematic characterization of factors that regulate Drosophila segmentation via a bacterial one-hybrid system

    Specificity data for groups of transcription factors (TFs) in a common regulatory network can be used to computationally identify the location of cis-regulatory modules in a genome. The primary limitation for this type of analysis is the paucity of specificity data that is available for the majority of TFs. We describe an omega-based bacterial one-hybrid system that provides a rapid method for characterizing DNA-binding specificities on a genome-wide scale. Using this system, 35 members of the Drosophila melanogaster segmentation network have been characterized, including representative members of all of the major classes of DNA-binding domains. A suite of web-based tools was created that uses this binding site dataset and phylogenetic comparisons to identify cis-regulatory modules throughout the fly genome. These tools allow specificities for any combination of factors to be used to perform rapid local or genome-wide searches for cis-regulatory modules. The utility of these factor specificities and tools is demonstrated on the well-characterized segmentation network. By incorporating specificity data on an additional 66 factors that we have characterized, our tools utilize ∼14% of the predicted factors within the fly genome and provide an important new community resource for the identification of cis-regulatory modules

    Cross-species comparison significantly improves genome-wide prediction of cis-regulatory modules in Drosophila

    BACKGROUND: The discovery of cis-regulatory modules in metazoan genomes is crucial for understanding the connection between genes and organism diversity. It is important to quantify how comparative genomics can improve computational detection of such modules. RESULTS: We run the Stubb software on the entire D. melanogaster genome, to obtain predictions of modules involved in segmentation of the embryo. Stubb uses a probabilistic model to score sequences for clustering of transcription factor binding sites, and can exploit multiple species data within the same probabilistic framework. The predictions are evaluated using publicly available gene expression data for thousands of genes, after careful manual annotation. We demonstrate that the use of a second genome (D. pseudoobscura) for cross-species comparison significantly improves the prediction accuracy of Stubb, and is a more sensitive approach than intersecting the results of separate runs over the two genomes. The entire list of predictions is made available online. CONCLUSION: Evolutionary conservation of modules serves as a filter to improve their detection in silico. The future availability of additional fruitfly genomes therefore carries the prospect of highly specific genome-wide predictions using Stubb

    Large-scale analysis of transcriptional cis-regulatory modules reveals both common features and distinct subclasses

    Analysis of 280 experimentally verified cis-regulatory modules from Drosophila reveals features both common to all and unique to distinct subclasses of modules

    Emerging properties of animal gene regulatory networks

    Gene regulatory networks (GRNs) provide system level explanations of developmental and physiological functions in terms of the genomic regulatory code. Depending on their developmental functions, GRNs differ in their degree of hierarchy, and also in the types of modular sub-circuit of which they are composed, although there is a commonly employed sub-circuit repertoire. Mathematical modelling of some types of GRN sub-circuit has deepened biological understanding of the functions they mediate. The structural organization of various kinds of GRN reflects their roles in the life process, and causally illuminates both developmental and evolutionary process