21,042 research outputs found

    A new online clustering approach for data in arbitrary shaped clusters

    In this paper we demonstrate a new density-based clustering technique, CODAS, for online clustering of streaming data into arbitrarily shaped clusters. CODAS is a two-stage process that uses a simple local density measure to initiate micro-clusters, which are then combined into clusters. Memory efficiency is gained by not storing or re-using any data. Computational efficiency is gained by using hyper-spherical micro-clusters, which allow a dimensionally independent micro-cluster joining technique. The micro-clusters divide the data space into sub-spaces, each with a core region and a non-core region. Intersecting core regions define the clusters. A threshold value is used to distinguish outlier micro-clusters from small clusters of unusual data. The cluster information is fully maintained online. We compare CODAS with ELM, DEC, Chameleon, DBSCAN and DenStream and demonstrate that CODAS achieves comparable results, but in a fully online and dimensionally scalable manner.
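
The two-stage idea in this abstract (micro-clusters first, then joining by intersecting core regions) can be illustrated with a minimal sketch. This is not the authors' CODAS implementation: the class name `MicroClusterSketch`, the running-mean centre update, the choice of half the radius as the core region, and the `min_points` threshold are all assumptions made for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class MicroCluster:
    def __init__(self, center):
        self.center = list(center)
        self.count = 1  # number of points absorbed so far


class MicroClusterSketch:
    """Illustrative online clustering with hyper-spherical micro-clusters.

    Assumptions (not taken from the abstract): every micro-cluster has the
    same fixed radius, its core region is the inner half of that radius,
    and a micro-cluster with fewer than `min_points` points is an outlier.
    """

    def __init__(self, radius, min_points):
        self.radius = radius
        self.min_points = min_points
        self.micro = []

    def add(self, point):
        # Stage 1: assign the point to the nearest micro-cluster that
        # contains it, or start a new micro-cluster around it.
        best = None
        for mc in self.micro:
            d = dist(mc.center, point)
            if d <= self.radius and (best is None or d < best[0]):
                best = (d, mc)
        if best is None:
            self.micro.append(MicroCluster(point))
            return
        mc = best[1]
        mc.count += 1
        # Running-mean update: the point is used once and never stored.
        mc.center = [c + (p - c) / mc.count for c, p in zip(mc.center, point)]

    def clusters(self):
        # Stage 2: join dense micro-clusters whose core regions intersect.
        # Two cores of radius r/2 intersect when the centres are at most r apart.
        dense = [i for i, mc in enumerate(self.micro)
                 if mc.count >= self.min_points]
        seen, groups = set(), []
        for start in dense:
            if start in seen:
                continue
            seen.add(start)
            stack, group = [start], []
            while stack:
                i = stack.pop()
                group.append(i)
                for j in dense:
                    if j not in seen and dist(self.micro[i].center,
                                              self.micro[j].center) <= self.radius:
                        seen.add(j)
                        stack.append(j)
            groups.append(sorted(group))
        return groups
```

Calling `add` once per incoming point and `clusters` on demand returns groups of micro-cluster indices. The joining test needs only one centre-to-centre distance per pair of micro-clusters, and micro-clusters below the threshold are left out as outliers.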

    Multiorder neurons for evolutionary higher-order clustering and growth

    This letter proposes to use multiorder neurons for clustering irregularly shaped data arrangements. Multiorder neurons are an evolutionary extension of the use of higher-order neurons in clustering. Higher-order neurons parametrically model complex neuron shapes by replacing the classic synaptic weight with higher-order tensors. The multiorder neuron goes one step further and eliminates two problems associated with higher-order neurons. First, it uses evolutionary algorithms to select the best neuron order for a given problem. Second, it obtains more information about the underlying data distribution by identifying the correct order for a given cluster of patterns. Empirically, when the correlation of the clusters found with ground-truth information is used to measure clustering accuracy, the proposed evolutionary multiorder neuron method outperforms other related clustering methods. The simulation results from the Iris, Wine, and Glass data sets show significant improvement over the results obtained using self-organizing maps and higher-order neurons. The letter also proposes an intuitive model by which multiorder neurons can be grown, thereby determining the number of clusters in the data.

    Fuzzy Clustering for Image Segmentation Using Generic Shape Information

    The performance of clustering algorithms for image segmentation is highly sensitive to the features used and the types of objects in the image, which ultimately limits their generalization capability. This provides strong motivation to investigate integrating shape information into the clustering framework to improve the generality of these algorithms. Existing shape-based clustering techniques mainly focus on circular and elliptical clusters and so are unable to segment arbitrarily shaped objects. To address this limitation, this paper presents a new shape-based algorithm called fuzzy clustering for image segmentation using generic shape information (FCGS), which exploits the B-spline representation of an object's shape in combination with the Gustafson-Kessel clustering algorithm. Qualitative and quantitative results confirm that FCGS consistently achieves superior segmentation performance compared to well-established shape-based clustering techniques, for a wide range of test images comprising various regular and arbitrarily shaped objects.

    Cross-Entropy Clustering

    We construct a cross-entropy clustering (CEC) theory which finds the optimal number of clusters by automatically removing groups which carry no information. Moreover, our theory gives a simple and efficient criterion to verify cluster validity. Although CEC can be built on an arbitrary family of densities, in the most important case of Gaussian CEC the division into clusters is affine invariant; the clustering tends to divide the data into ellipsoid-type shapes; and the approach is computationally efficient, as we can apply the Hartigan approach. We also study with particular attention clustering based on spherical Gaussian densities and on Gaussian densities with covariance sI, where I is the identity matrix. In the latter case we show that with s converging to zero we obtain the classical k-means clustering.
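
The k-means limit mentioned at the end of this abstract can be made plausible with a short calculation. This is a sketch under stated assumptions, not the paper's derivation. We assume the clustering cost is the sum, over clusters, of the cluster weight times the code length of the cluster label plus the cross-entropy of the cluster with respect to its Gaussian. The symbols $p_i$ (fraction of points in cluster $Y_i$), $m_i$ and $\Sigma_i$ (empirical mean and covariance of $Y_i$), $N$ (dimension), $n$ (number of points) and $k$ (number of clusters) are our notation.

```latex
% Assumed cost, with each cluster modelled by a Gaussian N(m_i, sI):
E_s = \sum_{i=1}^{k} p_i \left( -\ln p_i
      + \frac{N}{2}\ln(2\pi s)
      + \frac{1}{2s}\,\operatorname{tr}\Sigma_i \right)

% Collecting terms, using \sum_i p_i = 1:
E_s = \underbrace{-\sum_{i=1}^{k} p_i \ln p_i}_{\le\, \ln k}
      + \frac{N}{2}\ln(2\pi s)
      + \frac{1}{2s} \sum_{i=1}^{k} p_i \operatorname{tr}\Sigma_i

% The last sum is the within-cluster sum of squares divided by n:
\sum_{i=1}^{k} p_i \operatorname{tr}\Sigma_i
  = \frac{1}{n} \sum_{i=1}^{k} \sum_{x \in Y_i} \lVert x - m_i \rVert^2
```

The first term is bounded and the second does not depend on the partition, so as $s \to 0$ the factor $1/(2s)$ makes the within-cluster sum of squares dominate. Minimizing $E_s$ over partitions then amounts to minimizing the k-means objective.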