3,567 research outputs found

    Similarity-Driven Cluster Merging Method for Unsupervised Fuzzy Clustering

    Get PDF
    In this paper, a similarity-driven cluster merging method is proposed for unsupervised fuzzy clustering. The cluster merging method is used to resolve the problem of cluster validation. Starting with an overspecified number of clusters in the data, pairs of similar clusters are merged based on the proposed similarity-driven cluster merging criterion. The similarity between clusters is calculated by a fuzzy cluster similarity matrix, while an adaptive threshold is used for merging. In addition, a modified generalized objective function is used for prototype-based fuzzy clustering. The function includes the p-norm distance measure as well as principal components of the clusters. The number of the principal components is determined automatically from the data being clustered. The performance of this unsupervised fuzzy clustering algorithm is evaluated by several experiments of an artificial data set and a gene expression data set.Singapore-MIT Alliance (SMA

    Fuzzy clustering with volume prototypes and adaptive cluster merging

    Get PDF
    Two extensions to the objective function-based fuzzy clustering are proposed. First, the (point) prototypes are extended to hypervolumes, whose size can be fixed or can be determined automatically from the data being clustered. It is shown that clustering with hypervolume prototypes can be formulated as the minimization of an objective function. Second, a heuristic cluster merging step is introduced where the similarity among the clusters is assessed during optimization. Starting with an overestimation of the number of clusters in the data, similar clusters are merged in order to obtain a suitable partitioning. An adaptive threshold for merging is proposed. The extensions proposed are applied to Gustafson–Kessel and fuzzy c-means algorithms, and the resulting extended algorithm is given. The properties of the new algorithm are illustrated by various examples

    Extended Fuzzy Clustering Algorithms

    Get PDF
    Fuzzy clustering is a widely applied method for obtaining fuzzy models from data. Ithas been applied successfully in various fields including finance and marketing. Despitethe successful applications, there are a number of issues that must be dealt with in practicalapplications of fuzzy clustering algorithms. This technical report proposes two extensionsto the objective function based fuzzy clustering for dealing with these issues. First, the(point) prototypes are extended to hypervolumes whose size is determined automaticallyfrom the data being clustered. These prototypes are shown to be less sensitive to a biasin the distribution of the data. Second, cluster merging by assessing the similarity amongthe clusters during optimization is introduced. Starting with an over-estimated number ofclusters in the data, similar clusters are merged during clustering in order to obtain a suitablepartitioning of the data. An adaptive threshold for merging is introduced. The proposedextensions are applied to Gustafson-Kessel and fuzzy c-means algorithms, and the resultingextended algorithms are given. The properties of the new algorithms are illustrated invarious examples.fuzzy clustering;cluster merging;similarity;volume prototypes

    Evolving Clustering Algorithms And Their Application For Condition Monitoring, Diagnostics, & Prognostics

    Get PDF
    Applications of Condition-Based Maintenance (CBM) technology requires effective yet generic data driven methods capable of carrying out diagnostics and prognostics tasks without detailed domain knowledge and human intervention. Improved system availability, operational safety, and enhanced logistics and supply chain performance could be achieved, with the widespread deployment of CBM, at a lower cost level. This dissertation focuses on the development of a Mutual Information based Recursive Gustafson-Kessel-Like (MIRGKL) clustering algorithm which operates recursively to identify underlying model structure and parameters from stream type data. Inspired by the Evolving Gustafson-Kessel-like Clustering (eGKL) algorithm, we applied the notion of mutual information to the well-known Mahalanobis distance as the governing similarity measure throughout. This is also a special case of the Kullback-Leibler (KL) Divergence where between-cluster shape information (governed by the determinant and trace of the covariance matrix) is omitted and is only applicable in the case of normally distributed data. In the cluster assignment and consolidation process, we proposed the use of the Chi-square statistic with the provision of having different probability thresholds. Due to the symmetry and boundedness property brought in by the mutual information formulation, we have shown with real-world data that the algorithm’s performance becomes less sensitive to the same range of probability thresholds which makes system tuning a simpler task in practice. As a result, improvement demonstrated by the proposed algorithm has implications in improving generic data driven methods for diagnostics, prognostics, generic function approximations and knowledge extractions for stream type of data. The work in this dissertation demonstrates MIRGKL’s effectiveness in clustering and knowledge representation and shows promising results in diagnostics and prognostics applications

    Some Clustering Methods, Algorithms and their Applications

    Get PDF
    Clustering is a type of unsupervised learning [15]. When no target values are known, or "supervisors," in an unsupervised learning task, the purpose is to produce training data from the inputs themselves. Data mining and machine learning would be useless without clustering. If you utilize it to categorize your datasets according to their similarities, you'll be able to predict user behavior more accurately. The purpose of this research is to compare and contrast three widely-used data-clustering methods. Clustering techniques include partitioning, hierarchy, density, grid, and fuzzy clustering. Machine learning, data mining, pattern recognition, image analysis, and bioinformatics are just a few of the many fields where clustering is utilized as an analytical technique. In addition to defining the various algorithms, specialized forms of cluster analysis, linking methods, and please offer a review of the clustering techniques used in the big data setting

    A hierarchical Mamdani-type fuzzy modelling approach with new training data selection and multi-objective optimisation mechanisms: A special application for the prediction of mechanical properties of alloy steels

    Get PDF
    In this paper, a systematic data-driven fuzzy modelling methodology is proposed, which allows to construct Mamdani fuzzy models considering both accuracy (precision) and transparency (interpretability) of fuzzy systems. The new methodology employs a fast hierarchical clustering algorithm to generate an initial fuzzy model efficiently; a training data selection mechanism is developed to identify appropriate and efficient data as learning samples; a high-performance Particle Swarm Optimisation (PSO) based multi-objective optimisation mechanism is developed to further improve the fuzzy model in terms of both the structure and the parameters; and a new tolerance analysis method is proposed to derive the confidence bands relating to the final elicited models. This proposed modelling approach is evaluated using two benchmark problems and is shown to outperform other modelling approaches. Furthermore, the proposed approach is successfully applied to complex high-dimensional modelling problems for manufacturing of alloy steels, using ‘real’ industrial data. These problems concern the prediction of the mechanical properties of alloy steels by correlating them with the heat treatment process conditions as well as the weight percentages of the chemical compositions

    Techniques for clustering gene expression data

    Get PDF
    Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state of the art applications which recognises these limitations and implements procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered
    corecore