    A Survey on Soft Subspace Clustering

    Subspace clustering (SC) is a promising clustering technology that identifies clusters based on their associations with subspaces in high-dimensional spaces. SC can be classified into hard subspace clustering (HSC) and soft subspace clustering (SSC). While HSC algorithms have been extensively studied and are well accepted by the scientific community, SSC algorithms are relatively new but have gained more attention in recent years due to their better adaptability. In this paper, a comprehensive survey of existing SSC algorithms and their recent development is presented. The SSC algorithms are classified systematically into three main categories, namely conventional SSC (CSSC), independent SSC (ISSC) and extended SSC (XSSC). The characteristics of these algorithms are highlighted and the potential future development of SSC is also discussed. Comment: This paper has been published in Information Sciences Journal in 201

    An Intelligent Decision Support System for Leukaemia Diagnosis using Microscopic Blood Images

    This research proposes an intelligent decision support system for acute lymphoblastic leukaemia diagnosis from microscopic blood images. A novel clustering algorithm with stimulating discriminant measures (SDM) of both within- and between-cluster scatter variances is proposed to produce robust segmentation of the nucleus and cytoplasm of lymphocytes/lymphoblasts. Specifically, the proposed between-cluster evaluation is formulated based on the trade-off of several between-cluster measures of well-known feature extraction methods. The SDM measures are used in conjunction with a Genetic Algorithm for clustering nucleus, cytoplasm, and background regions. Subsequently, a total of eighty features consisting of shape, texture, and colour information of the nucleus and cytoplasm subimages are extracted. A number of classifiers (multi-layer perceptron, Support Vector Machine (SVM) and Dempster-Shafer ensemble) are employed for lymphocyte/lymphoblast classification. Evaluated with the ALL-IDB2 database, the proposed SDM-based clustering overcomes the shortcomings of Fuzzy C-means, which focuses purely on within-cluster scatter variance. It also outperforms Linear Discriminant Analysis and Fuzzy Compactness and Separation for nucleus-cytoplasm separation. The overall system achieves recognition accuracies of 96.72% and 96.67% using bootstrapping and 10-fold cross validation with Dempster-Shafer and SVM, respectively. The results also compare favourably with those reported in the literature, indicating the usefulness of the proposed SDM-based clustering method.
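
To make the distinction between within- and between-cluster scatter concrete, the following is a minimal sketch of the textbook definitions of the two quantities. It is not the paper's SDM measure, whose exact form the abstract does not give; the function name `scatter` and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def scatter(X, labels):
    """Return (within, between) cluster scatter for a hard partition.

    within  = sum of squared distances from each point to its cluster centre
    between = sum over clusters of (cluster size) * squared distance from
              the cluster centre to the overall mean
    These are the generic definitions, not the SDM measures of the paper.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    overall_mean = X.mean(axis=0)
    within = 0.0
    between = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        centre = members.mean(axis=0)
        within += ((members - centre) ** 2).sum()
        between += len(members) * ((centre - overall_mean) ** 2).sum()
    return within, between

# Hypothetical toy data: two well-separated pairs of points.
X = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
within, between = scatter(X, [0, 0, 1, 1])
```

A criterion that looks only at `within` rewards compact clusters, while one that also uses `between` additionally rewards clusters that lie far apart; for any partition the two values sum to the total scatter of the data around its overall mean.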

    clusterBMA: Bayesian model averaging for clustering

    Within the ensemble clustering literature, various methods have been developed to combine inference across multiple sets of results for unsupervised clustering. The approach of reporting results from one 'best' model out of several candidate clustering models generally ignores the uncertainty that arises from model selection, and results in inferences that are sensitive to the particular model and parameters chosen. Bayesian model averaging (BMA) is a popular approach for combining results across multiple models that offers some attractive benefits in this setting, including probabilistic interpretation of the combined cluster structure and quantification of model-based uncertainty. In this work we introduce clusterBMA, a method that enables weighted model averaging across results from multiple unsupervised clustering algorithms. We use clustering internal validation criteria to develop an approximation of the posterior model probability, which is used for weighting the results from each model. From a consensus matrix representing a weighted average of the clustering solutions across models, we apply symmetric simplex matrix factorisation to calculate final probabilistic cluster allocations. In addition to outperforming other ensemble clustering methods on simulated data, clusterBMA offers unique features including probabilistic allocation to averaged clusters, combining allocation probabilities from 'hard' and 'soft' clustering algorithms, and measuring model-based uncertainty in averaged cluster allocation. This method is implemented in an accompanying R package of the same name.
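
The idea of a consensus matrix formed as a weighted average of clustering solutions can be sketched as follows. This is an illustrative sketch only, not the clusterBMA implementation: it handles hard labels only, the weights are placeholder numbers rather than the approximate posterior model probabilities described above, and the matrix factorisation step is omitted. The function names `coassociation` and `weighted_consensus` are hypothetical.

```python
import numpy as np

def coassociation(labels):
    """n x n matrix with 1.0 where points i and j share a cluster, else 0.0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def weighted_consensus(label_sets, weights):
    """Weighted average of co-association matrices.

    The weights are normalised to sum to 1, so each entry of the result
    lies in [0, 1] and can be read as a weighted share of models that
    place the two points in the same cluster.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * coassociation(l) for wi, l in zip(w, label_sets))

# Three hypothetical clustering solutions for four points.
label_sets = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
weights = [0.5, 0.25, 0.25]  # placeholder model weights (assumption)
consensus = weighted_consensus(label_sets, weights)
```

Because co-association depends only on whether two points share a label, the first and third solutions contribute identical matrices even though their label values differ, which is why averaging is done on these matrices rather than on the raw labels.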

    Clustering of Steel Strip Sectional Profiles Based on Robust Adaptive Fuzzy Clustering Algorithm

    In this paper, intelligent techniques are applied to enhance quality control precision in steel strip cold rolling production. First, a new control scheme is proposed whose core is a classifier of steel strip cross-sectional profiles; a fuzzy clustering algorithm is used to establish this classifier. Second, a novel fuzzy clustering algorithm is proposed and used in the real application. Compared with the results obtained by the conventional fuzzy clustering algorithm, the results show that the new algorithm is robust and efficient: it not only yields better clustering prototypes, which are used as the classifier, but also detects outliers easily and effectively, which greatly helps to improve the performance of the new system. Finally, it is pointed out that the new algorithm's efficiency is mainly due to the introduction of a set of adaptive operators that account for the different influences of data objects on the clustering operations, and that the new fuzzy algorithm is in essence a generalized version of the existing fuzzy clustering algorithm.

    Fuzzy feature evaluation index and connectionist realization

    A new feature evaluation index based on fuzzy set theory and a connectionist model for its evaluation are provided. A concept of a flexible membership function incorporating weighting factors is introduced, which makes the modeling of the class structures more appropriate. A neuro-fuzzy algorithm is developed for determining the optimum weighting coefficients representing the feature importance. The overall importance of the features is evaluated both individually and in a group, considering their dependence as well as independence. The effectiveness of the algorithms, along with a comparison, is demonstrated on speech and Iris data.