2,294 research outputs found

    Macrostate Data Clustering

    We develop an effective nonhierarchical data clustering method using an analogy to the dynamic coarse graining of a stochastic system. Analyzing the eigensystem of an interitem transition matrix identifies fuzzy clusters corresponding to the metastable macroscopic states (macrostates) of a diffusive system. A "minimum uncertainty criterion" determines the linear transformation from eigenvectors to cluster-defining window functions. Eigenspectrum gap and cluster certainty conditions identify the proper number of clusters. The physically motivated fuzzy representation and associated uncertainty analysis distinguish macrostate clustering from spectral partitioning methods. Macrostate data clustering solves a variety of test cases that challenge other methods. Comment: keywords: cluster analysis, clustering, pattern recognition, spectral graph theory, dynamic eigenvectors, machine learning, macrostates, classification

    On-line evolving fuzzy clustering

    This paper presents EFCM, a novel on-line evolving fuzzy clustering method that extends the evolving clustering method (ECM) of Kasabov and Song (2002). Because it is an on-line algorithm, the fuzzy membership matrix of the data is updated whenever an existing cluster expands or a new cluster is formed. EFCM does not need the number of clusters to be defined in advance. The algorithm is tested on several benchmark data sets, such as Iris, Wine, Glass, E-Coli, Yeast and Italian Olive oils. EFCM attains the lowest objective function value compared with ECM and Fuzzy C-Means. It is significantly faster (by several orders of magnitude) than any of the off-line batch-mode clustering algorithms. A methodology is also proposed for using the Xie-Beni cluster validity measure to optimize the number of clusters. © 2007 IEEE
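The abstract does not say how EFCM applies the Xie-Beni measure, so the following is only a minimal sketch of the standard index itself: the membership-weighted within-cluster scatter divided by the number of points times the smallest squared distance between two cluster centres. Lower values indicate more compact, better separated clusters. The function name `xie_beni`, the argument layout and the toy data are assumptions made for illustration.

```python
import itertools
import math


def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni validity index (lower is better).

    memberships[i][k] is the membership of point k in cluster i.
    m is the fuzzifier; m = 2 matches the original definition.
    """
    n = len(points)
    # Compactness: membership-weighted squared distances to the centres.
    compactness = sum(
        memberships[i][k] ** m * math.dist(points[k], centers[i]) ** 2
        for i in range(len(centers))
        for k in range(n)
    )
    # Separation: smallest squared distance between any two centres.
    separation = min(
        math.dist(a, b) ** 2 for a, b in itertools.combinations(centers, 2)
    )
    return compactness / (n * separation)


# Hypothetical toy data: two tight, well separated groups with crisp memberships.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
memberships = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
score = xie_beni(points, centers, memberships)
```

One way to use such an index for choosing the number of clusters is to compute it for several candidate partitions and keep the one with the smallest value.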

    Fuzzy clustering and fuzzy c-means partition cluster analysis and validation studies on a subset of citescore dataset

    A hard partition clustering algorithm assigns a point that is equally distant from several clusters to only one of them, even though the datum could equally well be assigned to the other clusters. Fuzzy cluster analysis instead assigns membership coefficients to data points that are equidistant between two clusters, so that a point can belong to more than one cluster at the same time. For a subset of the CiteScore dataset, the fuzzy clustering (fanny) and fuzzy c-means (fcm) algorithms were applied to study the data points that lie equally distant from each other. Before the analysis, the clusterability of the dataset was evaluated with the Hopkins statistic, which gave 0.4371, a value below 0.5, indicating that the data are highly clusterable. The optimal number of clusters was determined using the NbClust package, where 9 indices proposed a 3-cluster solution as the best. Further, an appropriate value of the fuzziness parameter m was sought by examining how the distribution of membership values changes as m varies from 1 to 2. The coefficient of variation (CV), also known as relative variability, was evaluated to study the spread of the data. The time complexity of the fuzzy clustering (fanny) and fuzzy c-means algorithms was evaluated by keeping the number of data points constant and varying the number of clusters.
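To make the role of the membership coefficients and of the fuzziness parameter m concrete, here is a minimal sketch of the standard fuzzy c-means membership formula for a single point with the cluster centres held fixed. It is not the code used in the study (which relied on the fanny and fcm implementations); the function name `fcm_memberships` and the example coordinates are assumptions for illustration. The formula requires m > 1, with m approaching 1 giving nearly hard assignments.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means memberships of one point in each cluster (fixed centres)."""
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centres belongs fully to them.
    on_center = [d == 0.0 for d in dists]
    if any(on_center):
        share = 1.0 / sum(on_center)
        return [share if hit else 0.0 for hit in on_center]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


centers = [(-1.0, 0.0), (1.0, 0.0)]
# A point equidistant from both centres gets equal membership in each.
equidistant = fcm_memberships((0.0, 0.0), centers)
# A point closer to the first centre; a smaller m gives a crisper assignment.
soft = fcm_memberships((-0.5, 0.0), centers, m=2.0)
crisp = fcm_memberships((-0.5, 0.0), centers, m=1.1)
```

A hard partition would have to place the equidistant point in one cluster arbitrarily, whereas the fuzzy memberships record that it sits between the two.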

    An analysis of possible applications of fuzzy set theory to the actuarial credibility theory

    In this work, we review the basic concepts of actuarial credibility theory with a view to introducing applications of fuzzy set-theoretic methods. We show how the concept of actuarial credibility can be modeled through fuzzy set membership functions and how fuzzy set methods, especially fuzzy pattern recognition, can provide an alternative tool for estimating credibility.