Consensus clustering and functional interpretation of gene-expression data

By S. Swift, A. Tucker, V. Vinciotti, Nigel Martin, C.A. Orengo, X. Liu and P. Kellam

Abstract

Microarray analysis using clustering algorithms can suffer from a lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus set of clusters from a number of clustering methods should improve confidence in gene-expression analysis. Here we introduce consensus clustering, which provides such an advantage. When coupled with a statistically based gene functional analysis, our method allowed the identification of novel genes regulated by NFκB and the unfolded protein response in certain B-cell lymphomas.

Topics: csis
Publisher: Springer
Year: 2004
OAI identifier: oai:eprints.bbk.ac.uk.oai2:2991

