Techniques for clustering gene expression data
Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.
The Law of Large Numbers in a Metric Space with a Convex Combination Operation
We consider a separable complete metric space equipped with a convex combination operation. For such spaces, we identify the corresponding convexification operator and show that the invariant elements for this operator appear naturally as limits in the strong law of large numbers. It is shown how to uplift the suggested construction to work with subsets of the basic space in order to develop a systematic way of proving laws of large numbers for such operations with random sets.
A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm
K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. In this paper, we first present an overview of these methods with an emphasis on their computational efficiency. We then compare eight commonly used linear time complexity initialization methods on a large and diverse collection of data sets using various performance criteria. Finally, we analyze the experimental results using non-parametric statistical tests and provide recommendations for practitioners. We demonstrate that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods.
Comment: 17 pages, 1 figure, 7 tables
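To make the idea of a linear-time initialization concrete, here is a minimal sketch of k-means++ style seeding, in which each new center is sampled with probability proportional to its squared distance from the nearest center chosen so far. This scheme is picked purely for illustration: the abstract does not name the eight methods it compares. The function names (`kmeans_pp_init`, `sse`) and the toy data are hypothetical, not taken from the paper.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_init(points, k, rng):
    """Pick k initial centers with k-means++ style D^2-weighted sampling.

    Each round makes one pass over the data, so the cost is O(n * k).
    """
    centers = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            # Sample index i with probability proportional to d2[i].
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
            centers.append(points[idx])
        d2 = [min(old, squared_distance(p, centers[-1]))
              for old, p in zip(d2, points)]
    return centers


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical, well-separated 2-D blobs (assumed toy data).
    blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
              for cx, cy in blobs for _ in range(50)]
    seeded = kmeans_pp_init(points, 3, rng)
    uniform = rng.sample(points, 3)
    print("D^2-weighted seeding SSE:", round(sse(points, seeded), 2))
    print("uniform seeding SSE:     ", round(sse(points, uniform), 2))
```

Comparing the two printed values over several random seeds gives a rough feel for how much the starting centers can change the initial clustering cost, which is the sensitivity the abstract describes. It is not a substitute for the paper's statistical comparison.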