
4 research outputs found

PRIORITY R-TREE WITH K-MODE FOR EFFICIENT CLUSTERING OF UNCERTAIN DATA

Author: Sathappan S.
Tomar Dr. D.C.
Publication venue: International Journal of Innovative Technology and Research
Publication date: 16/04/2015

Uncertain data carries an inherent degree of uncertainty and is commonly found in sensor networks. Identifying uncertain data is very expensive. Many algorithms have been proposed for handling uncertain data, such as k-means, UK-means, global kernel k-means, u-rule, and fuzzy c-means. However, most previous approaches try to cluster the whole dataset, and overlapping data is not treated well. In this paper, we propose two novel active learning algorithms: 1) k-mode, for classifying the certain and uncertain data within a whole dataset, and 2) Priority R-Tree, for clustering the certain and uncertain data in each domain. They handle both supervised and unsupervised datasets. These techniques greatly improve the robustness and accuracy of the clustering outcome. By minimizing the expected error with respect to the optimal classifier, the experimental results show the clusters obtained on the Gas Sensor Array Drift dataset.

International Journal of Innovative Technology and Research (IJITR)

Uncertain centroid based partitional clustering of uncertain data

Author: Jain A.
Publication venue: VLDB Endowment

Crossref

Metric and trigonometric pruning for clustering of uncertain data in 2D geometric space

Author: Chau M
Cheng R
Cheung DW
Kao B
Lee SD
Ngai WK
Yip KY
Publication venue: Elsevier BV
Publication date: 01/01/2011

We study the problem of clustering data objects with location uncertainty. In our model, a data object is represented by an uncertainty region over which a probability density function (pdf) is defined. One method to cluster such uncertain objects is to apply the UK-means algorithm [1], an extension of the traditional K-means algorithm, which assigns each object to the cluster whose representative has the smallest expected distance from it. For an arbitrary pdf, calculating the expected distance between an object and a cluster representative requires expensive integration of the pdf. We study two pruning methods, pre-computation (PC) and cluster shift (CS), which can significantly reduce the number of integrations computed. Both pruning methods rely on good bounding techniques. We propose and evaluate two such techniques, based on metric properties (Met) and trigonometry (Tri). Our experimental results show that Tri offers very high pruning power: in some cases, more than 99.9% of the expected distance calculations are pruned. This results in a very efficient clustering algorithm. © 2010 Elsevier B.V. All rights reserved.

HKU Scholars Hub
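
The UK-means assignment step described in the abstract above (each object goes to the cluster whose representative has the smallest expected distance) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: each uncertain object is assumed to be given as a list of equally weighted 2D sample points drawn from its pdf, so the expected distance is approximated by a sample mean instead of an integral. The function names `expected_distance` and `assign` are hypothetical, and no pruning (PC, CS, Met, or Tri) is attempted.

```python
import math


def expected_distance(samples, rep):
    """Approximate E[dist(X, rep)] by averaging over equally weighted samples.

    Assumption: `samples` stands in for the object's pdf; the abstract
    describes an integration over the pdf instead.
    """
    return sum(math.dist(s, rep) for s in samples) / len(samples)


def assign(objects, reps):
    """For each uncertain object, return the index of the cluster
    representative with the smallest (approximate) expected distance."""
    labels = []
    for samples in objects:
        dists = [expected_distance(samples, r) for r in reps]
        labels.append(dists.index(min(dists)))
    return labels


# Two hypothetical uncertain objects, each described by three sample points.
objects = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(9.0, 9.0), (10.0, 10.0), (11.0, 9.0)],
]
# Two hypothetical cluster representatives.
reps = [(0.0, 0.0), (10.0, 10.0)]

labels = assign(objects, reps)
print(labels)
```

Every object is compared against every representative here, which is exactly the cost that the pruning methods in the abstract aim to avoid.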