Clustering Via Normal Mixture Models

Abstract

We consider a model-based approach to clustering, whereby each observation is assumed to have arisen from an underlying mixture of a finite number of distributions. The number of components in this mixture model corresponds to the number of clusters to be imposed on the data. A common assumption is to take the component distributions to be multivariate normal, possibly with some restrictions on the component covariance matrices. The model can be fitted to the data by maximum likelihood, implemented via the EM algorithm. A number of computational issues are associated with the fitting, including the specification of starting values for the EM algorithm and the testing for the number of components in the final version of the model. We shall discuss some of these problems and describe an algorithm that attempts to handle them automatically.

1. INTRODUCTION

In some applications of mixture models, questions related to clustering may arise only after the mixture mo..
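
To make the fitting procedure named in the abstract concrete, the following is a minimal sketch of maximum likelihood fitting of a normal mixture by the EM algorithm. It is not the authors' algorithm. It assumes univariate rather than multivariate components for brevity, takes the number of components g as given, and uses starting means taken from sample quantiles, which is only one illustrative choice of starting values. The function names fit_normal_mixture and assign_clusters are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_normal_mixture(data, g, n_iter=200, tol=1e-8):
    """Fit a g-component univariate normal mixture by EM.

    Returns (weights, means, variances, loglik), where loglik is the
    log-likelihood evaluated at the last E-step.
    """
    n = len(data)
    ordered = sorted(data)
    # Assumed starting values: means at evenly spaced sample quantiles,
    # a common pooled variance, and equal mixing proportions.
    means = [ordered[int((i + 0.5) / g * n)] for i in range(g)]
    overall_mean = sum(data) / n
    pooled_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [pooled_var] * g
    weights = [1.0 / g] * g
    prev_loglik = -math.inf
    loglik = -math.inf

    for _ in range(n_iter):
        # E-step: posterior probabilities of component membership.
        resp = []
        loglik = 0.0
        for x in data:
            joint = [weights[i] * normal_pdf(x, means[i], variances[i]) for i in range(g)]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([j / total for j in joint])

        # M-step: weighted updates of proportions, means, and variances.
        for i in range(g):
            n_i = sum(r[i] for r in resp)
            weights[i] = n_i / n
            means[i] = sum(r[i] * x for r, x in zip(resp, data)) / n_i
            var_i = sum(r[i] * (x - means[i]) ** 2 for r, x in zip(resp, data)) / n_i
            variances[i] = max(var_i, 1e-6)  # floor guards against degenerate components

        # Stop once the log-likelihood gain becomes negligible.
        if loglik - prev_loglik < tol:
            break
        prev_loglik = loglik

    return weights, means, variances, loglik


def assign_clusters(data, weights, means, variances):
    """Assign each observation to the component with the highest posterior probability."""
    labels = []
    for x in data:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        labels.append(joint.index(max(joint)))
    return labels


# Usage on synthetic data: two well-separated groups (illustrative only).
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(10.0, 1.0) for _ in range(200)]
weights, means, variances, loglik = fit_normal_mixture(data, g=2)
labels = assign_clusters(data, weights, means, variances)
```

The sketch does not address choosing the number of components, and a single deterministic start is used here only to keep the example short. In practice EM is usually run from several starting points, since the likelihood can have multiple local maxima.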
