Ensemble based distributed k-harmonic means clustering

Abstract

Due to the explosion in the number of autonomous data sources, there is a growing need for effective approaches to distributed knowledge discovery and data mining. Distributed clustering algorithms cluster distributed datasets without necessarily downloading all the data to a single site. K-Means is a popular clustering method because of its simplicity and its speed on large datasets. A major problem, however, is that its performance depends on the initialization of the centroids, and distributed clustering algorithms based on K-Means inherit this sensitivity. K-Harmonic Means has been shown to be essentially insensitive to centroid initialization. In this paper, a novel ensemble-based distributed clustering algorithm using K-Harmonic Means is proposed. The simulated experiments described in the paper confirm the robust performance of the proposed algorithm.
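The abstract does not spell out the K-Harmonic Means update itself, so the following is only a minimal single-site sketch of the commonly cited formulation, in which each point contributes to every centroid through a soft membership and a per-point weight. The function names (`khm_objective`, `khm_step`, `khm`), the default exponent `p=3.5`, and the `eps` floor on distances are assumptions made for illustration. The ensemble and distributed parts of the proposed algorithm are not shown, because this page gives no details about them.

```python
import math


def khm_objective(points, centers, p=3.5, eps=1e-8):
    """K-Harmonic Means objective: sum over points of K / sum_k d_k^(-p)."""
    k = len(centers)
    total = 0.0
    for x in points:
        # Distances are floored at eps to avoid division by zero.
        inv = sum(max(math.dist(x, c), eps) ** -p for c in centers)
        total += k / inv
    return total


def khm_step(points, centers, p=3.5, eps=1e-8):
    """One centroid update; every point influences every centroid."""
    dim = len(points[0])
    num = [[0.0] * dim for _ in centers]
    den = [0.0] * len(centers)
    for x in points:
        d = [max(math.dist(x, c), eps) for c in centers]
        inv_p = sum(di ** -p for di in d)
        for j, dj in enumerate(d):
            # Soft membership times point weight simplifies to
            # d_j^(-p-2) / (sum_k d_k^(-p))^2.
            coef = dj ** (-p - 2) / (inv_p ** 2)
            den[j] += coef
            for t in range(dim):
                num[j][t] += coef * x[t]
    return [[num[j][t] / den[j] for t in range(dim)]
            for j in range(len(centers))]


def khm(points, centers, p=3.5, iters=50):
    """Repeat the update a fixed number of times (no convergence test)."""
    for _ in range(iters):
        centers = khm_step(points, centers, p)
    return centers
```

A call such as `khm(points, initial_centers)` takes points and centroids as equal-length coordinate sequences. With a single centroid and `p=2` every coefficient equals 1, so the update reduces to the ordinary mean, which is a convenient sanity check for the sketch.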

oai:CiteSeerX.psu:10.1.1.905.6109
Last updated on 11/1/2017

This paper was published in CiteSeerX.
