
High-throughput chromatin information enables accurate tissue-specific prediction of transcription factor binding sites

By Tom Whitington, Andrew C. Perkins and Timothy L. Bailey


In silico prediction of transcription factor binding sites (TFBSs) is central to the task of gene regulatory network elucidation. Genomic DNA sequence information provides a basis for these predictions, due to the sequence specificity of TF-binding events. However, DNA sequence alone is an impoverished source of information for TFBS prediction in eukaryotes, as additional factors, such as chromatin structure, regulate binding events. We show that incorporating high-throughput chromatin modification estimates can greatly improve the accuracy of in silico prediction of in vivo binding for a wide range of TFs in human and mouse. This improvement exceeds that gained by equivalent use of either transcription start site proximity or phylogenetic conservation information. Importantly, predictions made with the use of chromatin structure information are tissue specific. This result supports the biological hypothesis that chromatin modulates TF binding to produce tissue-specific binding profiles in higher eukaryotes, and suggests that the use of chromatin modification information can lead to accurate tissue-specific transcriptional regulatory network elucidation.

Topics: Computational Biology
Publisher: Oxford University Press
Provided by: PubMed Central