Study of subspace clustering algorithm of high dimensional data based on variable weighting methods

Abstract

The sparsity of high-dimensional data and the curse of dimensionality render most traditional clustering algorithms ineffective in high-dimensional spaces, which has made clustering of high-dimensional data sets an active research area. Subspace clustering is one effective approach to this problem. This paper introduces and implements two variable-weighting subspace clustering algorithms for high-dimensional data, SCAD and EWKM, which discover clusters in subspaces spanned by different combinations of dimensions via local weightings of features. Both algorithms are tested on synthetic and real-world data sets, and the results are analyzed to compare their performance and the settings to which each is suited.
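As a rough illustration of what local feature weighting means, the sketch below computes one weight per feature per cluster from the within-cluster dispersion, in the exponential form commonly associated with entropy-weighted k-means. It is not taken from the paper: the function name `ewkm_weights`, the parameter `gamma`, and the toy data are assumptions made for this example, and the full algorithm would alternate this step with assignment and centroid updates.

```python
import math


def ewkm_weights(points, labels, centers, gamma=1.0):
    """Return one weight vector per cluster (hypothetical helper).

    Assumed update rule: w[l][j] is proportional to exp(-D[l][j] / gamma),
    where D[l][j] is the summed squared deviation of cluster l's points
    from its center along feature j. Weights in each cluster sum to 1.
    """
    k = len(centers)
    m = len(centers[0])

    # Accumulate the per-cluster, per-feature dispersion D[l][j].
    disp = [[0.0] * m for _ in range(k)]
    for x, l in zip(points, labels):
        for j in range(m):
            disp[l][j] += (x[j] - centers[l][j]) ** 2

    weights = []
    for l in range(k):
        # Shifting by the minimum leaves the normalized result unchanged
        # and avoids underflow when dispersions are large.
        d_min = min(disp[l])
        exps = [math.exp(-(d - d_min) / gamma) for d in disp[l]]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Toy data (assumed): one cluster, tight along feature 0, spread along feature 1.
points = [(0.0, 5.0), (0.2, -5.0), (-0.2, 0.0)]
labels = [0, 0, 0]
centers = [(0.0, 0.0)]
weights = ewkm_weights(points, labels, centers, gamma=1.0)
```

In this toy case nearly all of the weight goes to feature 0, the dimension along which the cluster is compact. A larger `gamma` would spread the weight more evenly across features.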
