Distinguishing direct versus indirect transcription factor–DNA interactions
Transcriptional regulation is largely enacted by transcription factors (TFs) binding DNA. Large numbers of TF binding motifs have been revealed by ChIP-chip experiments followed by computational DNA motif discovery. However, the success of motif discovery algorithms has been limited when applied to sequences bound in vivo (such as those identified by ChIP-chip) because the observed TF–DNA interactions are not necessarily direct: some TFs predominantly associate with DNA indirectly through protein partners, while others exhibit both direct and indirect binding. Here, we present the first method for distinguishing between direct and indirect TF–DNA interactions, integrating in vivo TF binding data, in vivo nucleosome occupancy data, and motifs from in vitro protein binding microarray experiments. When applied to yeast ChIP-chip data, our method reveals that only 48% of the data sets can be readily explained by direct binding of the profiled TF, while 16% can be explained by indirect DNA binding. In the remaining 36%, none of the motifs used in our analysis was able to explain the ChIP-chip data, either because the data were too noisy or because the set of motifs was incomplete. As more in vitro TF DNA binding motifs become available, our method could be used to build a complete catalog of direct and indirect TF–DNA interactions. Our method is not restricted to yeast or to ChIP-chip data, but can be applied in any system for which both in vivo binding data and in vitro DNA binding motifs are available. National Science Foundation (U.S.) (CAREER Award 0347801)
Sequence specificity incompletely defines the genome-wide occupancy of Myc
BACKGROUND: The Myc-Max heterodimer is a transcription factor that regulates expression of a large number of genes. Genome occupancy of Myc-Max is thought to be driven by Enhancer box (E-box) DNA elements, CACGTG or variants, to which the heterodimer binds in vitro. RESULTS: By analyzing ChIP-Seq datasets, we demonstrate that the positions occupied by Myc-Max across the human genome correlate with the RNA polymerase II (Pol II) transcription machinery significantly better than with E-boxes. Metagene analyses show that in promoter regions, Myc is uniformly positioned about 100 bp upstream of essentially all promoter-proximal paused polymerases, with Max about 15 bp upstream of Myc. We re-evaluate the DNA binding properties of full-length Myc-Max proteins. Electrophoretic mobility shift assay results demonstrate that Myc-Max heterodimers display significant sequence preference, but have high affinity for any DNA. Quantification of the relative affinities of Myc-Max for all possible 8-mers using universal protein-binding microarray assays shows that sequences surrounding core 6-mers significantly affect binding. Compared to the in vitro sequence preferences, Myc-Max genomic occupancy measured by ChIP-Seq is largely, although not completely, independent of sequence specificity. CONCLUSIONS: We quantified the affinity of Myc-Max to all possible 8-mers and compared this with the sites of Myc binding across the human genome. Our results indicate that the genomic occupancy of Myc cannot be explained by its intrinsic DNA specificity and suggest that the transcription machinery and associated promoter accessibility play a predominant role in Myc recruitment. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0482-3) contains supplementary material, which is available to authorized users.
DNA Sequence Context Controls the Binding and Processivity of the T7 DNA Primase
Summary: Primases are key enzymes involved in DNA replication. They act on single-stranded DNA and catalyze the synthesis of short RNA primers used by DNA polymerases. Here, we investigate the DNA binding and activity of the bacteriophage T7 primase using a new workflow called high-throughput primase profiling (HTPP). Using a unique combination of high-throughput binding assays and biochemical analyses, HTPP reveals a complex landscape of binding specificity and functional activity for the T7 primase, determined by sequences flanking the primase recognition site. We identified specific features, such as G/T-rich flanks, which increase primase-DNA binding up to 10-fold and, surprisingly, also increase the length of newly formed RNA (up to 3-fold). To our knowledge, variability in primer length has not been reported for this primase. We expect that applying HTPP to additional enzymes will reveal new insights into the effects of DNA sequence composition on the DNA recognition and functional activity of primases. Subject Areas: Biochemical Mechanism, Molecular Biology, Molecular Genetics
Curated collection of yeast transcription factor DNA binding specificity data reveals novel structural and gene regulatory insights
Background: Transcription factors (TFs) play a central role in regulating gene expression by interacting with cis-regulatory DNA elements associated with their target genes. Recent surveys have examined the DNA binding specificities of most Saccharomyces cerevisiae TFs, but a comprehensive evaluation of their data has been lacking. Results: We analyzed in vitro and in vivo TF-DNA binding data reported in previous large-scale studies to generate a comprehensive, curated resource of DNA binding specificity data for all characterized S. cerevisiae TFs. Our collection comprises DNA binding site motifs and comprehensive in vitro DNA binding specificity data for all possible 8-bp sequences. Investigation of the DNA binding specificities within the basic leucine zipper (bZIP) and VHT1 regulator (VHR) TF families revealed unexpected plasticity in TF-DNA recognition: intriguingly, the VHR TFs, newly characterized by protein binding microarrays in this study, recognize bZIP-like DNA motifs, while the bZIP TF Hac1 recognizes a motif highly similar to the canonical E-box motif of basic helix-loop-helix (bHLH) TFs. We identified several TFs with distinct primary and secondary motifs, which might be associated with different regulatory functions. Finally, integrated analysis of in vivo TF binding data with protein binding microarray data lends further support for indirect DNA binding in vivo by sequence-specific TFs. Conclusions: The comprehensive data in this curated collection allow for more accurate analyses of regulatory TF-DNA interactions, in-depth structural studies of TF-DNA specificity determinants, and future experimental investigations of the TFs' predicted target genes and regulatory roles. National Human Genome Research Institute (U.S.) (grant R01 HG003420); National Human Genome Research Institute (U.S.) (grant R01 HG003985); American Heart Association (postdoctoral fellowship 10POST3650060)
Short tandem repeats recruit transcription factors to tune eukaryotic gene expression
Whole-Genome and Epigenomic Landscapes of Etiologically Distinct Subtypes of Cholangiocarcinoma
Cholangiocarcinoma (CCA) is a hepatobiliary malignancy exhibiting high incidence in countries with endemic liver-fluke infection. We analysed 489 CCAs from 10 countries, combining whole-genome (71 cases), targeted/exome, copy-number, gene expression, and DNA methylation information. Integrative clustering defined four CCA clusters: Fluke-Positive CCAs (Clusters 1/2) are enriched in ERBB2 amplifications and TP53 mutations; conversely, Fluke-Negative CCAs (Clusters 3/4) exhibit high copy-number alterations and PD-1/PD-L2 expression, or epigenetic mutations (IDH1/2, BAP1) and FGFR/PRKA-related gene rearrangements. Whole-genome analysis highlighted FGFR2 3'UTR deletion as a mechanism of FGFR2 upregulation. Integration of non-coding promoter mutations with protein-DNA binding profiles demonstrates pervasive modulation of H3K27me3-associated sites in CCA. Clusters 1 and 4 exhibit distinct DNA hypermethylation patterns targeting either CpG islands or shores; mutation signature and subclonality analysis suggests that these reflect different mutational pathways. Our results exemplify how genetics, epigenetics and environmental carcinogens can interplay across different geographies to generate distinct molecular subtypes of cancer.