37 research outputs found

    Integrating motif, DNA accessibility and gene expression data to build regulatory maps in an organism

    Characterization of cell type specific regulatory networks and elements is a major challenge in genomics, and emerging strategies frequently employ high-throughput genome-wide assays of transcription factor (TF) to DNA binding, histone modifications or chromatin state. However, these experiments remain too difficult/expensive for many laboratories to apply comprehensively to their system of interest. Here, we explore the potential of elucidating regulatory systems in varied cell types using computational techniques that rely on only data of gene expression, low-resolution chromatin accessibility, and TF-DNA binding specificities ('motifs'). We show that static computational motif scans overlaid with chromatin accessibility data reasonably approximate experimentally measured TF-DNA binding. We demonstrate that predicted binding profiles and expression patterns of hundreds of TFs are sufficient to identify major regulators of approximately 200 spatiotemporal expression domains in the Drosophila embryo. We are then able to learn reliable statistical models of enhancer activity for over 70 expression domains and apply those models to annotate domain specific enhancers genome-wide. Throughout this work, we apply our motif and accessibility based approach to comprehensively characterize the regulatory network of fruitfly embryonic development and show that the accuracy of our computational method compares favorably to approaches that rely on data from many experimental assays. Acids Research

    Core and region-enriched networks of behaviorally regulated genes and the singing genome

    Songbirds represent an important model organism for elucidating molecular mechanisms that link genes with complex behaviors, in part because they have discrete vocal learning circuits that have parallels with those that mediate human speech. We found that ~10% of the genes in the avian genome were regulated by singing, and we found a striking regional diversity of both basal and singing-induced programs in the four key song nuclei of the zebra finch, a vocal learning songbird. The region-enriched patterns were a result of distinct combinations of region-enriched transcription factors (TFs), their binding motifs, and presinging acetylation of histone 3 at lysine 27 (H3K27ac) enhancer activity in the regulatory regions of the associated genes. RNA interference manipulations validated the role of the calcium-response transcription factor (CaRF) in regulating genes preferentially expressed in specific song nuclei in response to singing. Thus, differential combinatorial binding of a small group of activity-regulated TFs and predefined epigenetic enhancer activity influences the anatomical diversity of behaviorally regulated gene networks

    Cross‐species systems analysis of evolutionary toolkits of neurogenomic response to social challenge

    Peer Reviewed
    https://deepblue.lib.umich.edu/bitstream/2027.42/147855/1/gbb12502.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/147855/2/gbb12502-sup-0002-TableS1.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/147855/3/gbb12502_am.pd

    Global analysis of Drosophila Cys2-His2 zinc finger proteins reveals a multitude of novel recognition motifs and binding determinants

    Cys2-His2 zinc finger proteins (ZFPs) are the largest group of transcription factors in higher metazoans. A complete characterization of these ZFPs and their associated target sequences is pivotal to fully annotate transcriptional regulatory networks in metazoan genomes. As a first step in this process, we have characterized the DNA-binding specificities of 129 zinc finger sets from Drosophila using a bacterial one-hybrid system. This data set contains the DNA-binding specificities for at least one encoded ZFP from 70 unique genes and 23 alternate splice isoforms representing the largest set of characterized ZFPs from any organism described to date. These recognition motifs can be used to predict genomic binding sites for these factors within the fruit fly genome. Subsets of fingers from these ZFPs were characterized to define their orientation and register on their recognition sequences, thereby allowing us to define the recognition diversity within this finger set. We find that the characterized fingers can specify 47 of the 64 possible DNA triplets. To confirm the utility of our finger recognition models, we employed subsets of Drosophila fingers in combination with an existing archive of artificial zinc finger modules to create ZFPs with novel DNA-binding specificity. These hybrids of natural and artificial fingers can be used to create functional zinc finger nucleases for editing vertebrate genomes

    Thermodynamics-Based Models of Transcriptional Regulation by Enhancers: The Roles of Synergistic Activation, Cooperative Binding and Short-Range Repression

    Quantitative models of cis-regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled, or heuristic approximations of the underlying regulatory mechanisms. We have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence, as a function of transcription factor concentrations and their DNA-binding specificities. It uses statistical thermodynamics theory to model not only protein-DNA interaction, but also the effect of DNA-bound activators and repressors on gene expression. In addition, the model incorporates mechanistic features such as synergistic effect of multiple activators, short range repression, and cooperativity in transcription factor-DNA binding, allowing us to systematically evaluate the significance of these features in the context of available expression data. Using this model on segmentation-related enhancers in Drosophila, we find that transcriptional synergy due to simultaneous action of multiple activators helps explain the data beyond what can be explained by cooperative DNA-binding alone. We find clear support for the phenomenon of short-range repression, where repressors do not directly interact with the basal transcriptional machinery. We also find that the binding sites contributing to an enhancer's function may not be conserved during evolution, and a noticeable fraction of these undergo lineage-specific changes. Our implementation of the model, called GEMSTAT, is the first publicly available program for simultaneously modeling the regulatory activities of a given set of sequences
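As a rough illustration of the statistical-thermodynamic reasoning described in the abstract above, the sketch below computes equilibrium occupancies for two binding sites with an optional cooperativity term. It is not GEMSTAT's implementation. The function name and the parameters `q1`, `q2` (statistical weights of each site being bound, combining TF concentration and binding affinity) and `omega` (cooperativity factor) are assumptions introduced only for this example.

```python
def two_site_occupancy(q1, q2, omega=1.0):
    """Marginal occupancy of two TF binding sites at equilibrium.

    Hypothetical illustration: q1 and q2 are the statistical weights of
    site 1 and site 2 being bound, and omega is a cooperativity factor
    (omega > 1 favors joint binding, omega = 1 means independent sites).
    """
    # Partition function over the four states:
    # empty, site 1 only, site 2 only, both sites bound.
    z = 1.0 + q1 + q2 + omega * q1 * q2
    # Occupancy of a site = total weight of states where it is bound / z.
    p1 = (q1 + omega * q1 * q2) / z
    p2 = (q2 + omega * q1 * q2) / z
    return p1, p2


if __name__ == "__main__":
    print(two_site_occupancy(1.0, 1.0))            # independent sites
    print(two_site_occupancy(1.0, 1.0, omega=10))  # cooperative sites
```

With `omega = 1` each site reduces to the single-site form `q / (1 + q)`, so any increase in occupancy at `omega > 1` can be attributed to the cooperativity term alone.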

    The Transcription Factor Ultraspiracle Influences Honey Bee Social Behavior and Behavior-Related Gene Expression

    Behavior is among the most dynamic animal phenotypes, modulated by a variety of internal and external stimuli. Behavioral differences are associated with large-scale changes in gene expression, but little is known about how these changes are regulated. Here we show how a transcription factor (TF), ultraspiracle (usp; the insect homolog of the Retinoid X Receptor), working in complex transcriptional networks, can regulate behavioral plasticity and associated changes in gene expression. We first show that RNAi knockdown of USP in honey bee abdominal fat bodies delayed the transition from working in the hive (primarily “nursing” brood) to foraging outside. We then demonstrate through transcriptomics experiments that USP induced many maturation-related transcriptional changes in the fat bodies by mediating transcriptional responses to juvenile hormone. These maturation-related transcriptional responses to USP occurred without changes in USP's genomic binding sites, as revealed by ChIP–chip. Instead, behaviorally related gene expression is likely determined by combinatorial interactions between USP and other TFs whose cis-regulatory motifs were enriched at USP's binding sites. Many modules of JH– and maturation-related genes were co-regulated in both the fat body and brain, predicting that usp and cofactors influence shared transcriptional networks in both of these maturation-related tissues. Our findings demonstrate how “single gene effects” on behavioral plasticity can involve complex transcriptional networks, in both brain and peripheral tissues

    Quantitative analysis of the Drosophila segmentation regulatory network using pattern generating potentials

    Cis-regulatory modules that drive precise spatial-temporal patterns of gene expression are central to the process of metazoan development. We describe a new computational strategy to annotate genomic sequences based on their “pattern generating potential” and to produce quantitative descriptions of transcriptional regulatory networks at the level of individual protein-module interactions. We use this approach to convert the qualitative understanding of interactions that regulate Drosophila segmentation into a network model in which a confidence value is associated with each transcription factor-module interaction. Sequence information from multiple Drosophila species is integrated with transcription factor binding specificities to determine conserved binding site frequencies across the genome. These binding site profiles are combined with transcription factor expression information to create a model to predict module activity patterns. This model is used to scan genomic sequences for the potential to generate all or part of the expression pattern o

    Computational Identification of Diverse Mechanisms Underlying Transcription Factor-DNA Occupancy

    ChIP-based genome-wide assays of transcription factor (TF) occupancy have emerged as a powerful, high-throughput method to understand transcriptional regulation, especially on a global scale. This has led to great interest in the underlying biochemical mechanisms that direct TF-DNA binding, with the ultimate goal of computationally predicting a TF's occupancy profile in any cellular condition. In this study, we examined the influence of various potential determinants of TF-DNA binding on a much larger scale than previously undertaken. We used a thermodynamics-based model of TF-DNA binding, called “STAP,” to analyze 45 TF-ChIP data sets from Drosophila embryonic development. We built a cross-validation framework that compares a baseline model, based on the ChIP'ed (“primary”) TF's motif, to more complex models where binding by secondary TFs is hypothesized to influence the primary TF's occupancy. Candidate interacting TFs were chosen based on RNA-seq expression data from the time point of the ChIP experiment. We found widespread evidence of both cooperative and antagonistic effects by secondary TFs, and explicitly quantified these effects. We were able to identify multiple classes of interactions, including (1) long-range interactions between primary and secondary motifs (separated by ≤150 bp), suggestive of indirect effects such as chromatin remodeling, (2) short-range interactions with specific inter-site spacing biases, suggestive of direct physical interactions, and (3) overlapping binding sites suggesting competitive binding. Furthermore, by factoring out the previously reported strong correlation between TF occupancy and DNA accessibility, we were able to categorize the effects into those that are likely to be mediated by the secondary TF's effect on local accessibility and those that utilize accessibility-independent mechanisms. Finally, we conducted in vitro pull-down assays to test model-based predictions of short-range cooperative interactions, and found that seven of the eight TF pairs tested physically interact and that some of these interactions mediate cooperative binding to DNA.

    Effect of cooperative interactions between pairs of TFs on the accuracy of modeling ChIP data.

    STAP was used with two motifs, the primary motif (representing the ChIP'ed TF) and a secondary motif (“M2”), with cooperative DNA-binding included in the model. Cooperative interaction between two TF sites was included in the model only if the two sites were within a pre-defined “Distance Threshold”, set to 150 bp in this set of experiments. The correlation coefficient between STAP scores from this model (CC(M1+M2)) was compared to the CC when using only the primary motif (CC(M1)) or when using only the secondary motif (CC(M2)) in STAP. The improvements are noted as “ImprOverM1” and “ImprOverM2”, respectively. The column “P-value” shows an empirically calculated p-value for the improvement, comparing the observed improvement to that expected from 100 shuffled versions of the secondary motif. The column “ImprOverM2” is the difference of CC(M1+M2) and the absolute value of CC(M2). The last column (“Z-score”) compares the observed improvement (ImprOverM1) to that obtained using other real motifs, corresponding to TFs expressed highly in that developmental stage, as the secondary motif. Shown here is only the single strongest secondary motif influence on each data set, if its P-value is ≤0.05 and Z-score is ≥3. The complete list of significant effects is in Supplementary Materials (Table S2, http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003571#pgen.1003571.s014). Note that all results are from cross-validation and thus account for the additional parameters in the two-motif model compared to the one-motif model.
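The column definitions in the caption above can be sketched as a small calculation. This is a minimal illustration, not the authors' code. It assumes that ImprOverM1 is the plain difference CC(M1+M2) - CC(M1), that the empirical p-value is the fraction of shuffled-motif improvements at least as large as the observed one, and that the Z-score uses the population standard deviation of the improvements from other real motifs. The function and argument names are hypothetical.

```python
from statistics import mean, pstdev


def summarize_secondary_motif(cc_m1, cc_m2, cc_m1m2, shuffled_impr, other_impr):
    """Compute the table columns for one primary/secondary motif pair.

    shuffled_impr: ImprOverM1 values from shuffled secondary motifs.
    other_impr:    ImprOverM1 values from other real secondary motifs.
    All definitions beyond ImprOverM2 are assumptions (see lead-in).
    """
    impr_over_m1 = cc_m1m2 - cc_m1          # assumed definition
    impr_over_m2 = cc_m1m2 - abs(cc_m2)     # as stated in the caption
    # Empirical p-value: share of shuffled motifs doing at least as well.
    p_value = sum(s >= impr_over_m1 for s in shuffled_impr) / len(shuffled_impr)
    # Z-score of the observed improvement against other real motifs.
    sd = pstdev(other_impr)
    z_score = (impr_over_m1 - mean(other_impr)) / sd if sd > 0 else float("nan")
    return {
        "ImprOverM1": impr_over_m1,
        "ImprOverM2": impr_over_m2,
        "P-value": p_value,
        "Z-score": z_score,
        # Reporting thresholds given in the caption.
        "reported": p_value <= 0.05 and z_score >= 3,
    }


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    row = summarize_secondary_motif(
        0.30, -0.10, 0.40,
        shuffled_impr=[0.01] * 99 + [0.20],
        other_impr=[0.0, 0.02, 0.01, 0.03, 0.02],
    )
    print(row)
```

With only 100 shuffles, the smallest nonzero p-value this estimate can produce is 0.01, which is worth keeping in mind when reading the ≤0.05 threshold.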