4 research outputs found

Vinayaka: A semi-supervised projected clustering method using differential evolution

Authors: Durga Toshniwal, Satish Gajawada
Publication date: 01/01/2012

Vinayaka is a semi-supervised projected clustering method based on differential evolution (DE). In this method, DE optimizes a hybrid cluster validation index: the Subspace Clustering Quality Estimate index (SCQE index) is used for internal cluster validation, and Gini index gain is used for external cluster validation. The proposed method is applied to the Wisconsin breast cancer dataset.
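
As a rough illustration of the external half of such an index, the sketch below computes Gini index gain in its usual form: the Gini impurity of the known class labels minus the size-weighted impurity inside each cluster. The function names and the four-point toy data are assumptions made for this example; they are not taken from the paper, and the abstract does not say how the gain is combined with the SCQE index.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_c^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(labels, clusters):
    """Impurity of all labels minus the size-weighted impurity per cluster."""
    n = len(labels)
    by_cluster = {}
    for label, cluster in zip(labels, clusters):
        by_cluster.setdefault(cluster, []).append(label)
    weighted = sum(len(m) / n * gini_impurity(m) for m in by_cluster.values())
    return gini_impurity(labels) - weighted


# Hypothetical toy labels, not rows from the Wisconsin breast cancer dataset.
labels = ["benign", "benign", "malignant", "malignant"]
perfect = gini_gain(labels, [0, 0, 1, 1])  # clusters match the classes
mixed = gini_gain(labels, [0, 1, 0, 1])    # clusters ignore the classes
print(perfect, mixed)
```

A clustering that separates the two classes gets the full gain, while one that mixes them evenly gets none, which is the behaviour an external validation term needs.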

CiteSeerX

Document re-ranking using cluster validation and label propagation

Authors: Donghong Ji, Guodong Zhou, Guozheng Xiao, Lingpeng Yang, Yu Nie
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2006

This paper proposes a novel document re-ranking approach for information retrieval (IR), which uses a label propagation-based semi-supervised learning algorithm to exploit the intrinsic structure underlying large document data. Since labeled relevant or irrelevant documents are generally not available in IR, the approach extracts pseudo-labeled documents from the ranking list of the initial retrieval. For pseudo-relevant documents, a cluster of documents is determined from the top-ranked ones via cluster validation-based k-means clustering; for pseudo-irrelevant ones, a set of documents is picked from the bottom-ranked ones. The documents are then ranked via label propagation. Evaluation on benchmark corpora shows that the approach achieves significant improvement over standard baselines and performs better than other related approaches.
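
The re-ranking step can be pictured with a minimal sketch, assuming a generic clamped label propagation rule in which each unlabeled document's score becomes the similarity-weighted average of the other documents' scores. The similarity matrix, the seed choice, and the function name are hypothetical; the paper selects pseudo-relevant seeds with cluster validation-based k-means, which is not reproduced here, and its exact propagation variant may differ.

```python
def propagate_labels(similarity, seeds, iterations=100):
    """Clamped label propagation over a symmetric similarity matrix.

    seeds maps document index -> fixed score (1.0 relevant, 0.0 irrelevant).
    """
    n = len(similarity)
    scores = [seeds.get(i, 0.5) for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in seeds:
                updated.append(seeds[i])  # pseudo labels stay fixed
                continue
            total = sum(similarity[i][j] for j in range(n) if j != i)
            if total == 0:
                updated.append(scores[i])  # isolated document keeps its score
                continue
            weighted = sum(similarity[i][j] * scores[j] for j in range(n) if j != i)
            updated.append(weighted / total)
        scores = updated
    return scores


# Hypothetical similarities for four documents, listed in initial ranking order.
similarity = [
    [1.0, 0.2, 0.8, 0.1],
    [0.2, 1.0, 0.1, 0.9],
    [0.8, 0.1, 1.0, 0.1],
    [0.1, 0.9, 0.1, 1.0],
]
seeds = {0: 1.0, 3: 0.0}  # top document pseudo-relevant, bottom pseudo-irrelevant
scores = propagate_labels(similarity, seeds)
reranked = sorted(range(len(scores)), key=lambda i: -scores[i])
print(reranked)
```

In this toy case document 2 resembles the pseudo-relevant seed and document 1 resembles the pseudo-irrelevant one, so propagation moves document 2 ahead of document 1 in the final ranking.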

CiteSeerX

Crossref

Document clustering based on cluster validation

Authors: Ji D.-H., Niu Z.-Y., Tan C.-L.
Publication date: 01/01/2004

International Conference on Information and Knowledge Management, Proceedings, 501-50

Crossref

ScholarBank@NUS

Document Clustering Based on Cluster Validation

Author: Zheng-yu Niu

This paper presents a cluster validation-based document clustering algorithm that is capable of identifying both important feature words and the true model order (cluster number). An important feature subset is selected by optimizing a cluster validity criterion subject to some constraint. To achieve model order identification, this feature selection procedure is conducted for each possible value of the cluster number. The feature subset and cluster number that maximize the cluster validity criterion are chosen as the answer. The algorithm has been applied to several datasets from the 20Newsgroup corpus. Experimental results show that it can find an important feature subset, estimate the model order, and yield higher micro-averaged precision than four other document clustering algorithms that require the cluster number to be provided.
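
The joint search described above can be sketched as follows, under loud assumptions: the abstract does not name its validity criterion or its constraint, so this example substitutes the mean silhouette width as the criterion, exhaustive enumeration of feature subsets as the search, and a plain k-means seeded with the first k points as the clusterer. The numeric toy features stand in for word features, and all names are hypothetical.

```python
from itertools import combinations


def distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means, seeded with the first k points for determinism."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: distance(p, centroids[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def silhouette(points, assignment):
    """Mean silhouette width (stand-in criterion); singletons contribute 0."""
    clusters = {}
    for p, a in zip(points, assignment):
        clusters.setdefault(a, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, a in zip(points, assignment):
        own = clusters[a]
        if len(own) == 1:
            continue
        cohesion = sum(distance(p, q) for q in own) / (len(own) - 1)
        separation = min(
            sum(distance(p, q) for q in other) / len(other)
            for label, other in clusters.items() if label != a
        )
        denom = max(cohesion, separation)
        if denom > 0:
            total += (separation - cohesion) / denom
    return total / len(points)


def select_features_and_k(points, k_values):
    """Try every feature subset and every k; keep the best-scoring pair."""
    n_features = len(points[0])
    best = (float("-inf"), None, None)
    for size in range(1, n_features + 1):
        for features in combinations(range(n_features), size):
            projected = [tuple(p[f] for f in features) for p in points]
            for k in k_values:
                score = silhouette(projected, kmeans(projected, k))
                if score > best[0]:
                    best = (score, features, k)
    return best


# Hypothetical data: feature 0 separates two groups, feature 1 is noise.
points = [(0.0, 5.0), (0.2, 1.0), (0.1, 3.0), (10.0, 2.0), (10.1, 4.0), (10.2, 0.0)]
best_score, best_features, best_k = select_features_and_k(points, k_values=(2, 3))
print(best_features, best_k)
```

On this toy data the search keeps only the informative feature and settles on two clusters. Exhaustive enumeration is only workable for a handful of features; a vocabulary-sized feature space would need the constrained optimization the abstract alludes to.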

CiteSeerX