Clustering of Expression Data from Microarrays: a Mixture-Based Approach

By Francesco Bartolucci and Francesca Chiaromonte

Abstract

A central aim of many statistical analyses of microarray data is to cluster genes according to their similarity in expression behavior. In this paper, we perform clustering based on the likelihood fit of a multivariate normal mixture. This approach has several advantages over standard partitioning or hierarchical algorithms: it has an unambiguous inferential characterization, it produces soft partitions through membership probabilities, and it allows one to model component mean vectors and covariance structures and to manage anomalous and missing observations in a natural way. In particular, our mixture-based approach allows us to (i) model component mean vectors through linear reparameterizations, (ii) model component covariance structures through constraints on a special decomposition, (iii) handle outliers through the introduction of a contamination term (uniform on the hypervolume of the data), and (iv) impute missing values. Maximum likelihood estimation of parameters and membership probabilities, together with the imputation of missing values, is accomplished through the EM algorithm. For model selection, we employ the classical Bayesian Information Criterion, pragmatically combined with consideration of other features, such as overall membership strength, within-cluster dispersion, and weight of the contamination term. To illustrate our approach, we analyze publicly available data on the reaction of yeast cells to heat shocks. The results of our analysis suggest two alternative clustering models, which provide two different and interesting interpretations of the structure in the data.
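
The core idea in the abstract (a normal mixture plus a uniform contamination term, fit by EM and scored by BIC) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes unconstrained component means and covariances, no missing values, at least two non-constant columns, and it approximates the "hypervolume of the data" by the bounding box of the observations. The function names, the ridge term, the random initialization, the BIC sign convention (lower is better), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def log_gauss(X, mean, cov):
    """Log density of a multivariate normal at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)          # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2 * np.pi) + log_det + np.sum(z ** 2, axis=0))

def e_step(X, weights, means, covs, log_unif):
    """Membership probabilities (last column = contamination) and log-likelihood."""
    n, k = X.shape[0], len(means)
    log_p = np.empty((n, k + 1))
    for j in range(k):
        log_p[:, j] = np.log(weights[j]) + log_gauss(X, means[j], covs[j])
    log_p[:, k] = np.log(weights[k]) + log_unif   # uniform contamination term
    log_norm = np.logaddexp.reduce(log_p, axis=1)
    return np.exp(log_p - log_norm[:, None]), log_norm.sum()

def fit_contaminated_mixture(X, k, n_iter=100, ridge=1e-6, seed=0):
    """EM for k normal components plus one uniform component (illustrative only)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumption: uniform density on the bounding box of the data.
    log_unif = -np.sum(np.log(X.max(axis=0) - X.min(axis=0)))
    means = X[rng.choice(n, size=k, replace=False)].copy()
    covs = np.stack([np.cov(X.T) + ridge * np.eye(d)] * k)
    weights = np.full(k + 1, 1.0 / (k + 1))
    for _ in range(n_iter):
        resp, _ = e_step(X, weights, means, covs, log_unif)
        nk = resp.sum(axis=0) + 1e-10             # guard against empty components
        weights = nk / nk.sum()
        for j in range(k):
            means[j] = resp[:, j] @ X / nk[j]
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + ridge * np.eye(d)
    resp, loglik = e_step(X, weights, means, covs, log_unif)
    # Free parameters: k mixing weights, k mean vectors, k full covariance matrices.
    n_params = k + k * d + k * d * (d + 1) // 2
    bic = -2.0 * loglik + n_params * np.log(n)
    return {"weights": weights, "means": means, "covs": covs,
            "resp": resp, "loglik": loglik, "bic": bic}

# Synthetic stand-in data (not the yeast heat-shock data from the paper).
demo_rng = np.random.default_rng(1)
demo_X = np.vstack([
    demo_rng.normal([0.0, 0.0], 0.5, size=(60, 2)),
    demo_rng.normal([5.0, 5.0], 0.5, size=(60, 2)),
    demo_rng.uniform(-3.0, 8.0, size=(10, 2)),    # scattered outliers
])
demo_fit = fit_contaminated_mixture(demo_X, k=2)
```

In this sketch, each row of `demo_fit["resp"]` gives the soft membership of one observation, and the last entry of `demo_fit["weights"]` is the estimated weight of the contamination term. Comparing `bic` across several values of `k` would mimic the model selection step, although the abstract notes that BIC is combined with other diagnostics rather than used alone.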

Topics: Clustering, EM algorithm, Maximum Likelihood Estimation, Microarray data, Multivari
Year: 2002
OAI identifier: oai:CiteSeerX.psu:10.1.1.18.7533
Provided by: CiteSeerX