
    Adaptive evolution of transcription factor binding sites

    The regulation of a gene depends on the binding of transcription factors to specific sites located in the regulatory region of the gene. The generation of these binding sites and of cooperativity between them are essential building blocks in the evolution of complex regulatory networks. We study a theoretical model for the sequence evolution of binding sites by point mutations. The approach is based on biophysical models for the binding of transcription factors to DNA. Hence we derive empirically grounded fitness landscapes, which enter a population genetics model including mutations, genetic drift, and selection. We show that the selection for factor binding generically leads to specific correlations between nucleotide frequencies at different positions of a binding site. We demonstrate the possibility of rapid adaptive evolution generating a new binding site for a given transcription factor by point mutations. The evolutionary time required is estimated in terms of the neutral (background) mutation rate, the selection coefficient, and the effective population size. The efficiency of binding site formation is seen to depend on two joint conditions: the binding site motif must be short enough and the promoter region must be long enough. These constraints on promoter architecture are indeed seen in eukaryotic systems. Furthermore, we analyse the adaptive evolution of genetic switches and of signal integration through binding cooperativity between different sites. Experimental tests of this picture involving the statistics of polymorphisms and phylogenies of sites are discussed.

    Biophysical Fitness Landscapes for Transcription Factor Binding Sites

    Evolutionary trajectories and phenotypic states available to cell populations are ultimately dictated by intermolecular interactions between DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by the interactions between transcription factors (TFs) and their cognate genomic sites. Our study is informed by high-throughput in vitro measurements of TF-DNA binding interactions and by a comprehensive collection of genomic binding sites. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding energy for a collection of 12 yeast TFs, and show that the shape of the predicted fitness functions is in broad agreement with a simple thermodynamic model of two-state TF-DNA binding. However, the effective temperature of the model is not always equal to the physical temperature, indicating selection pressures in addition to biophysical constraints caused by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, showing that epistasis is common in evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience a spectrum of selection pressures depending on their position in the genome. These findings argue for the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions
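
    The two-state thermodynamic model mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard bound/unbound occupancy form, 1 / (1 + exp(beta * (E - mu))), and treats fitness as tracking occupancy. The symbols `mu` (chemical potential) and `beta` (inverse effective temperature), and the numerical values in the usage note, are assumptions introduced for illustration and are not taken from the paper.

```python
import math


def occupancy(energy, mu, beta):
    """Probability that a site with binding energy `energy` is bound.

    Two-state (bound/unbound) model: 1 / (1 + exp(beta * (energy - mu))).
    `mu` and `beta` are illustrative parameters, not fitted values.
    """
    x = beta * (energy - mu)
    if x > 0:
        # Rewrite with exp(-x) so large positive x cannot overflow.
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


if __name__ == "__main__":
    # Hypothetical energies (arbitrary units); lower energy = stronger binding.
    for energy in (-8.0, -5.0, -2.0):
        print(energy, round(occupancy(energy, mu=-5.0, beta=1.0), 4))
```

    In this sketch a smaller `beta` (a higher effective temperature) flattens the curve, which is one way to picture an effective temperature that differs from the physical one.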

    Evaluation of genome-wide chromatin library of Stat5 binding sites in human breast cancer

    BACKGROUND: There is considerable interest in identifying target genes and chromatin binding sites for transcription factors in a genome-wide manner. Such information may become useful in diagnosis and treatment of disease, drug target identification, and for prognostication. In cancer diagnosis, patterns of transcription factor binding to specific regulatory chromatin elements are expected to complement and enhance current diagnostic predictions of tumor behavior based on protein and mRNA analyses. Signal transducer and activator of transcription-5 (Stat5) is a cytokine-activated transcription factor implicated in growth and progression of many malignancies, including hematopoietic, prostate, and breast cancer. We have explored immunoaffinity purification of Stat5-bound chromatin from breast cancer cells to identify Stat5 target sites in an unbiased, genome-wide manner. RESULTS: In this report, we evaluate the efficacy of a Stat5-bound chromatin library to identify valid Stat5 chromatin binding sites within the oncogenome of T-47D human breast cancer cells. A general problem with cloning of immunocaptured, transcription factor-bound chromatin fragments is contamination with non-specific chromatin. However, using an optimized strategy, five out of ten randomly selected clones could be experimentally verified to bind Stat5 both in vitro and in vivo as tested by electrophoretic mobility shift assay and chromatin immunoprecipitation, respectively. While there was no binding to fragments lacking a Stat5 consensus binding sequence, presence of a Stat5 binding sequence did not assure binding. CONCLUSION: A chromatin library coupled with experimental validation may productively identify novel in vivo Stat5 chromatin binding sites in cancer, including abnormal regulatory sites in tumor-specific neochromatin

    The Functional Consequences of Variation in Transcription Factor Binding

    One goal of human genetics is to understand how the information for precise and dynamic gene expression programs is encoded in the genome. The interactions of transcription factors (TFs) with DNA regulatory elements clearly play an important role in determining gene expression outputs, yet the regulatory logic underlying functional transcription factor binding is poorly understood. Many studies have focused on characterizing the genomic locations of TF binding, yet it is unclear to what extent TF binding at any specific locus has functional consequences with respect to gene expression output. To evaluate the context of functional TF binding we knocked down 59 TFs and chromatin modifiers in one HapMap lymphoblastoid cell line. We then identified genes whose expression was affected by the knockdowns. We intersected the gene expression data with transcription factor binding data (based on ChIP-seq and DNase-seq) within 10 kb of the transcription start sites of expressed genes. This combination of data allowed us to infer functional TF binding. On average, 14.7% of genes bound by a factor were differentially expressed following the knockdown of that factor, suggesting that most interactions between TF and chromatin do not result in measurable changes in gene expression levels of putative target genes. We found that functional TF binding is enriched in regulatory elements that harbor a large number of TF binding sites, at sites with predicted higher binding affinity, and at sites that are enriched in genomic regions annotated as active enhancers. Comment: 30 pages, 6 figures (7 supplemental figures and 6 supplemental tables available upon request). Submitted to PLoS Genetics.

    Identification of transcription factor binding sites in promoter databases

    Transcription factors (TFs) are proteins that regulate the expression of their target genes either positively or negatively. TFs do this by binding to specific DNA sequences in promoter regions via their DNA-binding motifs. Among ETS family TFs, Pea3 proteins are involved in regulating the expression of genes that are important for cell growth, development, differentiation, oncogenic transformation and apoptosis. In silico studies are needed to identify novel target genes for this TF. Although a few bioinformatics tools are available for this purpose, the user must go back and forth between different tools and repeat these steps for each candidate gene. Here we combined these tools into a new tool that examines the affinity of any TF for the promoter sequences of selected target genes. The tool was tested on several genes predicted to be regulated by the Pea3 TF.

    Structural fingerprints of transcription factor binding site regions

    Fourier transforms are a powerful tool in the prediction of DNA sequence properties, such as the presence/absence of codons. We have previously compiled a database of the structural properties of all 32,896 unique DNA octamers. In this work we apply Fourier techniques to the analysis of the structural properties of human chromosomes 21 and 22 and also to three sets of transcription factor binding sites within these chromosomes. We find that, for a given structural property, the structural property power spectra of chromosomes 21 and 22 are strikingly similar. We find common peaks in their power spectra for both Sp1 and p53 transcription factor binding sites. We use the power spectra as a structural fingerprint and perform similarity searching in order to find transcription factor binding site regions. This approach provides a new strategy for searching the genome data for information. Although it is difficult to understand the relationship between specific functional properties and the set of structural parameters in our database, our structural fingerprints nevertheless provide a useful tool for searching for function information in sequence data. The power spectrum fingerprints provide a simple, fast method for comparing a set of functional sequences, in this case transcription factor binding site regions, with the sequences of whole chromosomes. On its own, the power spectrum fingerprint does not find all transcription factor binding sites in a chromosome, but the results presented here show that in combination with other approaches, this technique will improve the chances of identifying functional sequences hidden in genomic data
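
    The power-spectrum fingerprint idea in this abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' method: `TOY_PROPERTY` is a hypothetical per-base table standing in for the octamer structural database, and cosine similarity is an assumed comparison measure, since the abstract does not say how spectra are compared.

```python
import cmath
import math

# Hypothetical per-base values; a real analysis would look up structural
# properties from a database (the abstract describes one indexed by octamer).
TOY_PROPERTY = {"A": 0.1, "C": 0.4, "G": 0.7, "T": 0.2}


def property_profile(seq):
    """Map a DNA sequence to a numeric profile using the toy table."""
    return [TOY_PROPERTY[base] for base in seq.upper()]


def power_spectrum(signal):
    """Power spectrum of a mean-centred real signal via a direct DFT.

    Returns values for frequencies k = 0 .. n // 2.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length spectra (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    spec = power_spectrum(property_profile("ACACACAC"))
    # A period-2 sequence concentrates its power at the highest frequency.
    print(max(range(len(spec)), key=spec.__getitem__))
```

    The direct DFT is quadratic in sequence length, so a sketch like this only suits short windows; chromosome-scale work would need a fast Fourier transform.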

    Predicting Combinatorial Binding of Transcription Factors to Regulatory Elements in the Human Genome by Association Rule Mining

    Cis-acting transcriptional regulatory elements in mammalian genomes typically contain specific combinations of binding sites for various transcription factors. Although some cis-regulatory elements have been well studied, the combinations of transcription factors that regulate normal expression levels for the vast majority of the 20,000 genes in the human genome are unknown. We hypothesized that it should be possible to discover transcription factor combinations that regulate gene expression in concert by identifying over-represented combinations of sequence motifs that occur together in the genome. In order to detect combinations of transcription factor binding motifs, we developed a data mining approach based on the use of association rules, which are typically used in market basket analysis. We scored each segment of the genome for the presence or absence of each of 83 transcription factor binding motifs, then used association rule mining algorithms to mine this dataset, thus identifying frequently occurring pairs of distinct motifs within a segment. Results: Support for most pairs of transcription factor binding motifs was highly correlated across different chromosomes although pair significance varied. Known true positive motif pairs showed higher association rule support, confidence, and significance than background. Our subsets of high-confidence, high-significance mined pairs of transcription factors showed enrichment for co-citation in PubMed abstracts relative to all pairs, and the predicted associations were often readily verifiable in the literature.
    Conclusion: Functional elements in the genome where transcription factors bind to regulate expression in a combinatorial manner are more likely to be predicted by identifying statistically and biologically significant combinations of transcription factor binding motifs than by simply scanning the genome for the occurrence of binding sites for a single transcription factor. (NIAAA Alcohol Training Grant; National Science Foundation; Cellular and Molecular Biology)
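
    The pairwise association-rule scoring described in this abstract might look roughly like the sketch below. It assumes each genome segment is represented as the set of motifs present in it and computes support and confidence for every motif pair. Lift is added as a common stand-in for an association-strength measure; the abstract does not specify how significance was computed. The function name and the motif names in the usage example are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(segments, min_support=0.0):
    """Score co-occurring motif pairs across segments.

    `segments` is a list of sets of motif names present in each segment.
    For each pair (a, b), with a < b alphabetically, returns:
      support    = fraction of segments containing both a and b
      confidence = P(b present | a present)
      lift       = support / (P(a) * P(b))
    """
    n = len(segments)
    single = Counter()
    pair = Counter()
    for motifs in segments:
        single.update(motifs)
        pair.update(combinations(sorted(motifs), 2))

    rules = {}
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        rules[(a, b)] = {
            "support": support,
            "confidence": count / single[a],
            "lift": support / ((single[a] / n) * (single[b] / n)),
        }
    return rules


if __name__ == "__main__":
    # Hypothetical presence/absence data for four segments.
    segments = [{"SP1", "P53"}, {"SP1", "P53"}, {"SP1"}, {"ETS"}]
    for pair_key, scores in pair_rules(segments).items():
        print(pair_key, scores)
```

    A lift above 1 means the two motifs co-occur more often than expected if they were placed independently, which is the intuition behind looking for over-represented combinations.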