112 research outputs found

    Inferring direct regulatory targets from expression and genome location analyses: a comparison of transcription factor deletion and overexpression

    BACKGROUND: Effects on gene expression due to environmental or genetic changes can be easily measured using microarrays. However, indirect effects on expression can be substantial. The indirect effects of a perturbation need to be distinguished from the direct effects if we are to understand the structure and behavior of regulatory networks. RESULTS: The most direct way to perturb a transcriptional network is to alter transcription factor activity. Here, for the first time, we compare expression changes and genomic binding in a simple regulon under conditions of both low and high transcription factor activity. Specifically, we assessed the effects on expression and binding due to deletion of the yeast LEU3 transcription factor gene and effects due to elevation of Leu3 activity. Leu3 activity was elevated through overexpression and the introduction of a mutation that renders the protein constitutively active. Genes that are bound and/or regulated by Leu3 under one or both conditions were characterized in terms of their functional annotations and their predicted potential to be bound by Leu3. We also assessed the evolutionary conservation of the predicted binding potential using a novel alignment-independent method. Both perturbations yield genes that are likely to be direct targets of Leu3, including most of the classically defined targets. Additional direct targets are identified by each of the methods. However, experimental and computational criteria suggest that most genes whose expression is affected by the Leu3 genotype are unlikely to be regulated by binding of the protein. CONCLUSION: Most genes that are differentially expressed by Leu3 are not direct targets despite the exceptional simplicity of the regulon, and the unusually direct nature of the perturbations investigated. These conclusions are reached through computational analyses that support and extend chromatin immunoprecipitation data on the identities of direct targets. 
    These results have implications for the interpretation of expression experiments, especially in cases for which chromatin immunoprecipitation data are unavailable, incomplete, or ambiguous.
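The distinction between direct and indirect effects drawn in this abstract can be illustrated with a minimal sketch. It assumes that a gene is a candidate direct target only when it is both bound by the factor and differentially expressed. The function `classify_targets` and the placeholder gene names are hypothetical and are not taken from the paper.

```python
def classify_targets(bound, responsive):
    """Split genes by binding and expression evidence (illustrative only)."""
    bound, responsive = set(bound), set(responsive)
    return {
        # Bound and differentially expressed: candidate direct targets.
        "direct": sorted(bound & responsive),
        # Differentially expressed but not bound: likely indirect effects.
        "indirect": sorted(responsive - bound),
        # Bound without a detectable expression change.
        "bound_only": sorted(bound - responsive),
    }


# Hypothetical inputs; a real analysis would use ChIP and microarray calls.
bound_genes = ["geneA", "geneB", "geneC"]
responsive_genes = ["geneA", "geneB", "geneD", "geneE"]

result = classify_targets(bound_genes, responsive_genes)
print(result)
```

In this toy input, more responsive genes fall into the indirect group than a naive reading of the expression data would suggest, which is the kind of outcome the abstract cautions about.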

    Unraveling networks of co-regulated genes on the sole basis of genome sequences

    With the growing number of available microbial genome sequences, regulatory signals can now be revealed as conserved motifs in promoters of orthologous genes (phylogenetic footprints). A next challenge is to unravel genome-scale regulatory networks. Using as sole input genome sequences, we predicted cis-regulatory elements for each gene of the yeast Saccharomyces cerevisiae by discovering over-represented motifs in the promoters of their orthologs in 19 Saccharomycetes species. We then linked all genes displaying similar motifs in their promoter regions and inferred a co-regulation network including 56 919 links between 3171 genes. Comparison with annotated regulons highlights the high predictive value of the method: a majority of the top-scoring predictions correspond to already known co-regulations. We also show that this inferred network is as accurate as a co-expression network built from hundreds of transcriptome microarray experiments. Furthermore, we experimentally validated 14 among 16 new functional links between orphan genes and known regulons. This approach can be readily applied to unravel gene regulatory networks from hundreds of microbial genomes for which no other information is available except the sequence. Long-term benefits can easily be perceived when considering the exponential increase of new genome sequences.
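The step of linking genes that display similar promoter motifs can be sketched as follows. This is a simplified illustration, not the authors' method: it assumes each gene has already been assigned a set of motif identifiers, and it swaps in Jaccard similarity with an arbitrary threshold in place of whatever motif comparison the paper uses. The names `jaccard`, `build_network`, and the placeholder genes and motifs are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_network(motifs_by_gene, threshold=0.5):
    """Link every gene pair whose motif sets are similar enough."""
    links = []
    for g1, g2 in combinations(sorted(motifs_by_gene), 2):
        score = jaccard(motifs_by_gene[g1], motifs_by_gene[g2])
        if score >= threshold:
            links.append((g1, g2, score))
    return links


# Hypothetical motif assignments (placeholder identifiers, not real motifs).
motifs_by_gene = {
    "geneA": {"m1", "m2"},
    "geneB": {"m1", "m2"},
    "geneC": {"m2"},
    "geneD": {"m3"},
}

network = build_network(motifs_by_gene, threshold=0.5)
for g1, g2, score in network:
    print(g1, g2, score)
```

Comparing all pairs is quadratic in the number of genes, which is acceptable for a few thousand genes but would need indexing by motif for much larger inputs.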

    Mapping Transcription Factor Networks and Elucidating Their Biological Determinants

    A central goal in systems biology is to accurately map the transcription factor (TF) network of a cell. Such a network map is a key component for many downstream applications, from developmental biology to transcriptome engineering, and from disease modeling to drug discovery. Building a reliable network map requires a wide range of data sources including TF binding locations and gene expression data after direct TF perturbations. However, we are facing two roadblocks. First, rich resources are available only for a few well-studied systems and cannot be easily replicated for new organisms or cell types. Second, when TF binding and TF-perturbation response data are available, they rarely converge on a common set of direct and functional targets for a TF. This dissertation explores and validates the best combination of experimental and analytic techniques to map TF networks. First, we introduce an unsupervised inference algorithm that maps TF networks by exploiting only gene expression and genome sequence data. We show that our “data light” method is more accurate at identifying direct targets of TFs than other similar methods. Second, we develop an optimization method to search for a convergent set of target genes that are independently identified by binding locations and perturbation responses of each TF. Combining this method with network inference greatly expanded the high-confidence network maps, especially when applied to datasets obtained by using recently developed experimental methods. Third, we describe a framework for predicting each gene’s responsiveness to a TF perturbation from genomic features. Using this framework, we identified properties of each gene that are independent of the perturbed TF as the major determinants of TF-perturbation responsiveness. This may lead to improvements in network mapping algorithms that exploit TF perturbation responses.
    Overall, this dissertation provides a scalable framework for mapping high-quality TF networks for a variety of organisms and cell types.