
Discovering transcriptional modules by Bayesian data integration

By Richard S. Savage, Zoubin Ghahramani, Jim E. Griffin, Bernard J. De la Cruz and David L. Wild


Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets.

Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs.
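
As a rough illustration of the clustering prior behind a Dirichlet process mixture model, the sketch below draws a partition of items from a Chinese restaurant process, the standard sequential view of the Dirichlet process. It is not the authors' hierarchical, gene-by-gene fusion model or their sampler. The function name `crp_assignments` and the parameters `alpha` and `seed` are assumptions made for this example only.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    Illustrative only: this is the generic Dirichlet process partition
    prior, not the hierarchical data-fusion model described in the abstract.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1  # join an existing cluster
        labels.append(k)
    return labels


if __name__ == "__main__":
    # Hypothetical usage: partition 20 "genes" with concentration alpha = 1.0.
    print(crp_assignments(20, alpha=1.0, seed=1))
```

The number of clusters is not fixed in advance: larger values of `alpha` tend to produce more clusters, which is why this family of priors suits problems where the number of modules is unknown.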

Topics: QA, QH
Publisher: Oxford University Press
Year: 2010

