12,489 research outputs found

    HIERARCHICAL CLUSTERING USING LEVEL SETS

    Get PDF
    Over the past several decades, clustering algorithms have earned their place as a go-to solution for database mining. This paper introduces a new concept which is used to develop a new recursive version of DBSCAN that can successfully perform hierarchical clustering, called Level- Set Clustering (LSC). A level-set is a subset of points of a data-set whose densities are greater than some threshold, ‘t’. By graphing the size of each level-set against its respective ‘t,’ indents are produced in the line graph which correspond to clusters in the data-set, as the points in a cluster have very similar densities. This new algorithm is able to produce the clustering result with the same O(n log n) time complexity as DBSCAN and OPTICS, while catching clusters the others missed

    A Survey on Soft Subspace Clustering

    Full text link
    Subspace clustering (SC) is a promising clustering technology to identify clusters based on their associations with subspaces in high dimensional spaces. SC can be classified into hard subspace clustering (HSC) and soft subspace clustering (SSC). While HSC algorithms have been extensively studied and well accepted by the scientific community, SSC algorithms are relatively new but gaining more attention in recent years due to better adaptability. In the paper, a comprehensive survey on existing SSC algorithms and the recent development are presented. The SSC algorithms are classified systematically into three main categories, namely, conventional SSC (CSSC), independent SSC (ISSC) and extended SSC (XSSC). The characteristics of these algorithms are highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201

    Cluster Evaluation of Density Based Subspace Clustering

    Full text link
    Clustering real world data often faced with curse of dimensionality, where real world data often consist of many dimensions. Multidimensional data clustering evaluation can be done through a density-based approach. Density approaches based on the paradigm introduced by DBSCAN clustering. In this approach, density of each object neighbours with MinPoints will be calculated. Cluster change will occur in accordance with changes in density of each object neighbours. The neighbours of each object typically determined using a distance function, for example the Euclidean distance. In this paper SUBCLU, FIRES and INSCY methods will be applied to clustering 6x1595 dimension synthetic datasets. IO Entropy, F1 Measure, coverage, accurate and time consumption used as evaluation performance parameters. Evaluation results showed SUBCLU method requires considerable time to process subspace clustering; however, its value coverage is better. Meanwhile INSCY method is better for accuracy comparing with two other methods, although consequence time calculation was longer.Comment: 6 pages, 15 figure

    Neural Collaborative Subspace Clustering

    Full text link
    We introduce the Neural Collaborative Subspace Clustering, a neural model that discovers clusters of data points drawn from a union of low-dimensional subspaces. In contrast to previous attempts, our model runs without the aid of spectral clustering. This makes our algorithm one of the kinds that can gracefully scale to large datasets. At its heart, our neural model benefits from a classifier which determines whether a pair of points lies on the same subspace or not. Essential to our model is the construction of two affinity matrices, one from the classifier and the other from a notion of subspace self-expressiveness, to supervise training in a collaborative scheme. We thoroughly assess and contrast the performance of our model against various state-of-the-art clustering algorithms including deep subspace-based ones.Comment: Accepted to ICML 201
    • 

    corecore