    Improvement of traditional k-means algorithm through the regulation of distance metric parameters

    This paper examines in detail the behavior of the basic k-means algorithm alongside four new variants that use different distance measures, applied to gene expression data. In data mining, k-means clustering is a method that aims to partition n observations into k clusters, in which each observation belongs to the cluster with the nearest mean. The traditional k-means is one of the most popular clustering methods for analyzing gene expression data. However, it suffers from major shortcomings: it is sensitive to the initial partition, and it is only applicable to data with spherical clusters. The results of the present study show that the new algorithms perform extremely well compared with the traditional k-means, and emphasize that, by regulating the distance metric parameters, one can achieve better clustering than the traditional k-means, with advantages in sensitivity, specificity, and run time. Finally, Canberra k-means is found to perform extremely well. © 2013 IEEE
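
The abstract does not give implementation details, so the following is only a minimal sketch of what a k-means variant with a pluggable distance measure might look like. It assumes that the Canberra distance replaces the Euclidean distance in the assignment step and that centroids are still updated as arithmetic means; the paper's actual update rule and parameters may differ. The names `canberra`, `euclidean`, `kmeans`, and the `init` argument are hypothetical and chosen for illustration.

```python
import random


def canberra(x, y):
    """Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|).

    Terms where both coordinates are zero are skipped (treated as 0).
    """
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def euclidean(x, y):
    """Standard Euclidean distance, as used by traditional k-means."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def kmeans(points, k, distance=canberra, init=None, max_iter=100, seed=0):
    """Lloyd-style k-means with a swappable distance in the assignment step.

    Assumption: centroids are updated as coordinate-wise means regardless
    of the distance used; this is not confirmed by the abstract.
    """
    if init is None:
        # Random initial centroids; results depend on this choice.
        rng = random.Random(seed)
        init = rng.sample(points, k)
    centroids = [list(c) for c in init]
    labels = [None] * len(points)

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels

        # Update step: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]

    return labels, centroids


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
            (10.0, 10.0), (10.5, 9.5), (9.8, 10.2)]
    labels, centroids = kmeans(data, k=2, init=[data[0], data[3]])
    print(labels, centroids)
```

Passing `distance=euclidean` gives the traditional behavior, which makes it easy to compare the two measures on the same data and initial centroids. Because the outcome depends on the initial partition, as the abstract notes, any comparison should fix `init` or `seed`.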