Combinatorial chromatin modification patterns in the human genome revealed by subspace clustering

By Duygu Ucar, Qingyang Hu and Kai Tan


Chromatin modifications, such as post-translational modification of histone proteins and incorporation of histone variants, play an important role in regulating gene expression. Joint analyses of multiple histone modification maps are starting to reveal combinatorial patterns of modifications that are associated with functional DNA elements, providing support to the ‘histone code’ hypothesis. However, due to the lack of analytical methods, only a small number of chromatin modification patterns have been discovered so far. Here, we introduce a scalable subspace clustering algorithm, coherent and shifted bicluster identification (CoSBI), to exhaustively identify the set of combinatorial modification patterns across a given epigenome. Performance comparisons demonstrate that CoSBI can generate biclusters with higher intra-cluster coherency and biological relevance. We apply our algorithm to a compendium of 39 genome-wide chromatin modification maps in human CD4+ T cells. We identify 843 combinatorial patterns that recur at >0.1% of the genome. A total of 19 chromatin modifications are observed in the combinatorial patterns, 10 of which occur in more than half of the patterns. We also identify combinatorial modification signatures for eight classes of functional DNA elements. Application of CoSBI to epigenome maps of different cells and developmental stages will aid in understanding how chromatin structure helps regulate gene expression.

Topics: Gene Regulation, Chromatin and Epigenetics
Publisher: Oxford University Press
Provided by: PubMed Central
