35 research outputs found

    Distinguishing direct versus indirect transcription factor–DNA interactions

    Transcriptional regulation is largely enacted by transcription factors (TFs) binding DNA. Large numbers of TF binding motifs have been revealed by ChIP-chip experiments followed by computational DNA motif discovery. However, the success of motif discovery algorithms has been limited when applied to sequences bound in vivo (such as those identified by ChIP-chip) because the observed TF–DNA interactions are not necessarily direct: some TFs predominantly associate with DNA indirectly through protein partners, while others exhibit both direct and indirect binding. Here, we present the first method for distinguishing between direct and indirect TF–DNA interactions, integrating in vivo TF binding data, in vivo nucleosome occupancy data, and motifs from in vitro protein binding microarray experiments. When applied to yeast ChIP-chip data, our method reveals that only 48% of the data sets can be readily explained by direct binding of the profiled TF, while 16% can be explained by indirect DNA binding. In the remaining 36%, none of the motifs used in our analysis was able to explain the ChIP-chip data, either because the data were too noisy or because the set of motifs was incomplete. As more in vitro TF DNA binding motifs become available, our method could be used to build a complete catalog of direct and indirect TF–DNA interactions. Our method is not restricted to yeast or to ChIP-chip data, but can be applied in any system for which both in vivo binding data and in vitro DNA binding motifs are available. Funding: National Science Foundation (U.S.) (CAREER Award 0347801).

    ScerTF: a comprehensive database of benchmarked position weight matrices for Saccharomyces species

    Saccharomyces cerevisiae is a primary model for studies of transcriptional control, and the specificities of most yeast transcription factors (TFs) have been determined by multiple methods. However, it is unclear which position weight matrices (PWMs) are most useful; for the roughly 200 TFs in yeast, there are over 1200 PWMs in the literature. To address this issue, we created ScerTF, a comprehensive database of 1226 motifs from 11 different sources. We identified a single matrix for each TF that best predicts in vivo data by benchmarking matrices against chromatin immunoprecipitation and TF deletion experiments. We also used in vivo data to optimize thresholds for identifying regulatory sites with each matrix. To correct for biases from different methods, we developed a strategy to combine matrices. These aligned matrices outperform the best available matrix for several TFs. We used the matrices to predict co-occurring regulatory elements in the genome and identified many known TF combinations. In addition, we predict new combinations and provide evidence of combinatorial regulation from gene expression data. The database is available through a web interface at http://ural.wustl.edu/ScerTF. The site allows users to search the database with a regulatory site or matrix to identify the TFs most likely to bind the input sequence.
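To make the idea of scanning a sequence with a position weight matrix and a score threshold concrete, here is a minimal sketch. It is not the ScerTF implementation: the count matrix, the pseudocount, the uniform background, the threshold of 4.0, and the function names `log_odds_matrix` and `scan` are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

# Hypothetical 4-position count matrix (one row per motif position,
# columns in the order A, C, G, T). Illustrative values, not from ScerTF.
COUNTS = [
    [8, 1, 1, 0],   # position 1 favours A
    [0, 9, 1, 0],   # position 2 favours C
    [0, 0, 10, 0],  # position 3 favours G
    [1, 0, 0, 9],   # position 4 favours T
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert counts to log2 odds against a uniform background (an assumption)."""
    pwm = []
    for row in counts:
        total = sum(row) + 4 * pseudocount
        pwm.append([math.log2(((c + pseudocount) / total) / background) for c in row])
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous characters
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


PWM = log_odds_matrix(COUNTS)
HITS = scan("TTACGTAACGTGG", PWM, threshold=4.0)
for start, window, score in HITS:
    print(start, window, round(score, 2))
```

In practice the threshold would be tuned per matrix against in vivo data, as the abstract describes; the fixed value here only shows where such a threshold enters the scan.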

    Statistical-mechanical lattice models for protein-DNA binding in chromatin

    Statistical-mechanical lattice models for protein-DNA binding are well established as a method to describe complex ligand binding equilibria measured in vitro with purified DNA and protein components. Recently, a new field of applications has opened up for this approach since it has become possible to experimentally quantify genome-wide protein occupancies in relation to the DNA sequence. In particular, the organization of the eukaryotic genome by histone proteins into a nucleoprotein complex termed chromatin has been recognized as a key parameter that controls the access of transcription factors to the DNA sequence. New approaches have to be developed to derive statistical mechanical lattice descriptions of chromatin-associated protein-DNA interactions. Here, we present the theoretical framework for lattice models of histone-DNA interactions in chromatin and investigate the (competitive) DNA binding of other chromosomal proteins and transcription factors. The results have a number of applications for quantitative models for the regulation of gene expression. Comment: 19 pages, 7 figures, accepted author manuscript, to appear in J. Phys.: Cond. Mat.
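As a rough illustration of what a lattice binding model computes, the sketch below evaluates the partition function for identical, non-overlapping ligands on a homogeneous one-dimensional lattice by dynamic programming, then derives the probability that each site is covered. This is a textbook simplification, not the framework of the paper: the single ligand type, the sequence-independent statistical weight (binding constant times free concentration), the absence of cooperativity, and the function names are all assumptions.

```python
def partition_function(n_sites, ligand_length, weight):
    """Return z, where z[i] is the partition function of a lattice of i sites.

    weight is the statistical weight of one bound ligand (binding constant
    times free ligand concentration); it is assumed identical at every position.
    """
    z = [1.0] * (n_sites + 1)  # z[0] = 1: the empty lattice has one state
    for i in range(1, n_sites + 1):
        z[i] = z[i - 1]  # last site is free
        if i >= ligand_length:
            z[i] += weight * z[i - ligand_length]  # a ligand ends at the last site
    return z


def coverage_profile(n_sites, ligand_length, weight):
    """Probability that each lattice site is covered by a bound ligand."""
    z = partition_function(n_sites, ligand_length, weight)
    total = z[n_sites]
    coverage = [0.0] * n_sites
    for start in range(n_sites - ligand_length + 1):
        end = start + ligand_length
        # Ligand on [start, end): the flanking segments are independent lattices.
        p = z[start] * weight * z[n_sites - end] / total
        for site in range(start, end):
            coverage[site] += p
    return coverage


print(partition_function(3, 2, 1.0))
print(coverage_profile(3, 2, 1.0))
```

Extending this toward chromatin would require position-dependent weights and several competing ligand types (for example histone octamers and transcription factors), which this sketch deliberately omits.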

    Pax6 interactions with chromatin and identification of its novel direct target genes in lens and forebrain.

    Pax6 encodes a specific DNA-binding transcription factor that regulates the development of multiple organs, including the eye, brain and pancreas. Previous studies have shown that Pax6 regulates the entire process of ocular lens development. In the developing forebrain, Pax6 is expressed in ventricular zone precursor cells and in specific populations of neurons; absence of Pax6 results in disrupted cell proliferation and cell fate specification in telencephalon. In the pancreas, Pax6 is essential for the differentiation of α-, β- and δ-islet cells. To elucidate molecular roles of Pax6, chromatin immunoprecipitation experiments combined with high-density oligonucleotide array hybridizations (ChIP-chip) were performed using three distinct sources of chromatin (lens, forebrain and β-cells). ChIP-chip studies, performed as biological triplicates, identified a total of 5,260 promoters occupied by Pax6. Of these promoter regions, 1,001 were shared between at least two distinct chromatin sources and 133 among all three. In lens chromatin, 2,335 promoters were bound by Pax6. RNA expression profiling from Pax6⁺/⁻ lenses combined with in vivo Pax6-binding data yielded 76 putative Pax6-direct targets, including the Gaa, Isl1, Kif1b, Mtmr2, Pcsk1n, and Snca genes. RNA and ChIP data were validated for all these genes. In lens cells, reporter assays established Kif1b and Snca as Pax6 activated and repressed genes, respectively. In situ hybridization revealed reduced expression of these genes in E14 cerebral cortex. Moreover, we examined differentially expressed transcripts between E9.5 wild type and Pax6⁻/⁻ lens placodes that suggested Efnb2, Fat4, Has2, Nav1, and Trpm3 as novel Pax6-direct targets. Collectively, the present studies, through the identification of Pax6-direct target genes, provide novel insights into the molecular mechanisms of Pax6 gene control during mouse embryonic development. In addition, the present data demonstrate that Pax6 interacts preferentially with promoter regions in a tissue-specific fashion. Nevertheless, nearly 20% of the regions identified are accessible to Pax6 in multiple tissues.

    Is Transcription Factor Binding Site Turnover a Sufficient Explanation for Cis-Regulatory Sequence Divergence?

    The molecular evolution of cis-regulatory sequences is not well understood. Comparisons of closely related species show that cis-regulatory sequences contain a large number of sites constrained by purifying selection. In contrast, there are a number of examples from distantly related species where cis-regulatory sequences retain little to no sequence similarity but drive similar patterns of gene expression. Binding site turnover, whereby the gain of a redundant binding site enables loss of a previously functional site, is one model by which cis-regulatory sequences can diverge without a concurrent change in function. To determine whether cis-regulatory sequence divergence is consistent with binding site turnover, we examined binding site evolution within orthologous intergenic sequences from 14 yeast species defined by their syntenic relationships with adjacent coding sequences. Both local and global alignments show that nearly all distantly related orthologous cis-regulatory sequences have no significant level of sequence similarity but are enriched for experimentally identified binding sites. Yet, a significant proportion of experimentally identified binding sites that are conserved in closely related species are absent in distantly related species and so cannot be explained by binding site turnover. Depletion of binding sites depends on the transcription factor but is detectable for a quarter of all transcription factors examined. Our results imply that binding site turnover is not a sufficient explanation for cis-regulatory sequence evolution.

    Computational study of associations between histone modification and protein-DNA binding in yeast genome by integrating diverse information

    Background: In parallel with the rapid development of high-throughput technologies, in vivo (in vitro) experiments for genome-wide identification of protein-DNA interactions have been developed. Nevertheless, a few questions remain in the field, such as how to distinguish true protein-DNA binding (functional binding) from non-specific protein-DNA binding (non-functional binding). Previous research tackled the problem by integrated analysis of multiple available sources. However, few systematic studies have been carried out to examine the possible relationships between histone modification and protein-DNA binding. Here this issue was investigated by using publicly available histone modification data in yeast. Results: Two separate histone modification datasets were studied, at both the open reading frame (ORF) and the promoter region of binding targets for 37 yeast transcription factors. Both results revealed a distinct histone modification pattern between the functional protein-DNA binding sites and non-functional ones for almost half of all TFs tested. This difference is much stronger at the ORF than at the promoter region. In addition, a protein-histone modification interaction pathway can only be inferred from the functional protein binding targets. Conclusions: Overall, the results suggest that histone modification information can be used to distinguish the functional protein-DNA binding from the non-functional, and that the regulation of various proteins is controlled by the modification of different histone lysines such as the protein-specific histone modification levels.

    YeTFaSCo: a database of evaluated yeast transcription factor sequence specificities

    The yeast Saccharomyces cerevisiae is a prevalent system for the analysis of transcriptional networks. As a result, multiple DNA-binding sequence specificities (motifs) have been derived for most yeast transcription factors (TFs). However, motifs from different studies are often inconsistent with each other, making subsequent analyses complicated and confusing. Here, we have created YeTFaSCo (The Yeast Transcription Factor Specificity Compendium, http://yetfasco.ccbr.utoronto.ca/), an extensive collection of S. cerevisiae TF specificities. YeTFaSCo differs from related databases by being more comprehensive (including 1709 motifs for 256 proteins or protein complexes), and by evaluating the motifs using multiple objective quality metrics. The metrics include correlation between motif matches and ChIP-chip data, gene expression patterns, and GO terms, as well as motif agreement between different studies. YeTFaSCo also features an index of ‘expert-curated’ motifs, each associated with a confidence assessment. In addition, the database website features tools for motif analysis, including a sequence scanning function and precomputed genome-browser tracks of motif occurrences across the entire yeast genome. Users can also search the database for motifs that are similar to a query motif.

    Tye7 regulates yeast Ty1 retrotransposon sense and antisense transcription in response to adenylic nucleotides stress

    Transposable elements play a fundamental role in genome evolution. It is proposed that their mobility, activated under stress, induces mutations that could confer advantages to the host organism. Transcription of the Ty1 LTR-retrotransposon of Saccharomyces cerevisiae is activated in response to a severe deficiency in adenylic nucleotides. Here, we show that Ty2 and Ty3 are also stimulated under these stress conditions, revealing the simultaneous activation of three active Ty retrotransposon families. We demonstrate that Ty1 activation in response to adenylic nucleotide depletion requires the DNA-binding transcription factor Tye7. Ty1 is transcribed in both sense and antisense directions. We identify three Tye7 potential binding sites in the region of Ty1 DNA sequence where antisense transcription starts. We show that Tye7 binds to Ty1 DNA and regulates Ty1 antisense transcription. Altogether, our data suggest that, in response to adenylic nucleotide reduction, TYE7 is induced and activates Ty1 mRNA transcription, possibly by controlling Ty1 antisense transcription. We also provide the first evidence that Ty1 antisense transcription can be regulated by environmental stress conditions, pointing to a new level of control of Ty1 activity by stress, as Ty1 antisense RNAs play an important role in regulating Ty1 mobility at both the transcriptional and post-transcriptional stages.

    Motif Enrichment Analysis: a unified framework and an evaluation on ChIP data

    A major goal of molecular biology is determining the mechanisms that control the transcription of genes. Motif Enrichment Analysis (MEA) seeks to determine which DNA-binding transcription factors control the transcription of a set of genes by detecting enrichment of known binding motifs in the genes' regulatory regions. Typically, the biologist specifies a set of genes believed to be co-regulated and a library of known DNA-binding models for transcription factors, and MEA determines which (if any) of the factors may be direct regulators of the genes. Since the number of factors with known DNA-binding models is rapidly increasing as a result of high-throughput technologies, MEA is becoming increasingly useful. In this paper, we explore ways to make MEA applicable in more settings, and evaluate the efficacy of a number of MEA approaches. We first define a mathematical framework for Motif Enrichment Analysis that relaxes the requirement that the biologist input a selected set of genes. Instead, the input consists of all regulatory regions, each labeled with the level of a biological signal. We then define and implement a number of motif enrichment analysis methods. Some of these methods require a user-specified signal threshold, some identify an optimum threshold in a data-driven way, and two of our methods are threshold-free. We evaluate these methods, along with two existing methods (Clover and PASTAA), using yeast ChIP-chip data. Our novel threshold-free method based on linear regression performs best in our evaluation, followed by the data-driven PASTAA algorithm. The Clover algorithm performs as well as PASTAA if the user-specified threshold is chosen optimally. Data-driven methods based on three statistical tests (Fisher Exact Test, rank-sum test, and multi-hypergeometric test) perform poorly, even when the threshold is chosen optimally. These methods (and Clover) perform even worse when unrestricted data-driven threshold determination is used. Our novel, threshold-free linear regression method works well on ChIP-chip data. Methods using data-driven threshold determination can perform poorly unless the range of thresholds is limited a priori. The limits implemented in PASTAA, however, appear to be well-chosen. Our novel algorithms, AME (Analysis of Motif Enrichment), are available at http://bioinformatics.org.au/ame/
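The threshold-free idea can be sketched as follows: regress the per-region biological signal on a per-region motif score and rank motifs by how well the fit explains the signal. This is only an illustration of the general approach, not the AME implementation; the motif names, the scores, the signal values, the use of R² as the ranking statistic, and the function names `linear_fit` and `rank_motifs` are all assumptions.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary across regions")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (sxy * sxy) / (sxx * syy)


def rank_motifs(motif_scores, signal):
    """Rank motifs by R^2, keeping only motifs whose score rises with the signal."""
    ranked = []
    for name, scores in motif_scores.items():
        slope, _, r_squared = linear_fit(scores, signal)
        if slope > 0:
            ranked.append((name, r_squared))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical data: one signal value and one motif score per regulatory region.
SIGNAL = [0.1, 0.5, 0.9, 1.4]
MOTIF_SCORES = {
    "motif_a": [1.0, 2.0, 3.0, 4.0],  # tracks the signal closely
    "motif_b": [2.0, 1.0, 2.0, 1.0],  # unrelated to the signal
}
RANKED = rank_motifs(MOTIF_SCORES, SIGNAL)
print(RANKED)
```

Because every region contributes to the fit, no cutoff separating "bound" from "unbound" regions is needed, which is the property the abstract highlights for the regression-based method.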

    ChIP on Chip: surprising results are often artifacts

    Background: The method of chromatin immunoprecipitation combined with microarrays (ChIP-Chip) is a powerful tool for genome-wide analysis of protein binding. However, a high background signal is a common phenomenon. Results: Reinvestigation of the chromatin immunoprecipitation procedure led us to discover four causes of high background: i) non-unique sequences, ii) incomplete reversion of crosslinks, iii) retention of protein in spin-columns and iv) insufficient RNase treatment. The chromatin immunoprecipitation method was modified and applied to analyze genome-wide binding of SeqA and σ³² in Escherichia coli. Conclusions: False positive findings originating from these shortcomings of the method could explain surprising and contradictory findings in published ChIP-Chip studies. We present a modified chromatin immunoprecipitation method greatly reducing the background signal.