2,265 research outputs found

    Cluster validity in clustering methods

    Get PDF

    A correlation-based fuzzy cluster validity index with secondary options detector

    Full text link
    The optimal number of clusters is one of the main concerns when applying cluster analysis. Several cluster validity indexes have been introduced to address this problem. However, in some situations, there is more than one option that can be chosen as the final number of clusters. This aspect has been overlooked by most of the existing works in this area. In this study, we introduce a correlation-based fuzzy cluster validity index known as the Wiroonsri-Preedasawakul (WP) index. This index is defined based on the correlation between the actual distance between a pair of data points and the distance between adjusted centroids with respect to that pair. We evaluate and compare the performance of our index with several existing indexes, including Xie-Beni, Pakhira-Bandyopadhyay-Maulik, Tang, Wu-Li, generalized C, and Kwon2. We conduct this evaluation on four types of datasets: artificial datasets, real-world datasets, simulated datasets with ranks, and image datasets, using the fuzzy c-means algorithm. Overall, the WP index outperforms most, if not all, of these indexes in terms of accurately detecting the optimal number of clusters and providing accurate secondary options. Moreover, our index remains effective even when the fuzziness parameter mm is set to a large value. Our R package called WPfuzzyCVIs used in this work is also available in https://github.com/nwiroonsri/WPfuzzyCVIs.Comment: 19 page

    A Survey of Adaptive Resonance Theory Neural Network Models for Engineering Applications

    Full text link
    This survey samples from the ever-growing family of adaptive resonance theory (ART) neural network models used to perform the three primary machine learning modalities, namely, unsupervised, supervised and reinforcement learning. It comprises a representative list from classic to modern ART models, thereby painting a general picture of the architectures developed by researchers over the past 30 years. The learning dynamics of these ART models are briefly described, and their distinctive characteristics such as code representation, long-term memory and corresponding geometric interpretation are discussed. Useful engineering properties of ART (speed, configurability, explainability, parallelization and hardware implementation) are examined along with current challenges. Finally, a compilation of online software libraries is provided. It is expected that this overview will be helpful to new and seasoned ART researchers

    Evidential relational clustering using medoids

    Get PDF
    In real clustering applications, proximity data, in which only pairwise similarities or dissimilarities are known, is more general than object data, in which each pattern is described explicitly by a list of attributes. Medoid-based clustering algorithms, which assume the prototypes of classes are objects, are of great value for partitioning relational data sets. In this paper a new prototype-based clustering method, named Evidential C-Medoids (ECMdd), which is an extension of Fuzzy C-Medoids (FCMdd) on the theoretical framework of belief functions is proposed. In ECMdd, medoids are utilized as the prototypes to represent the detected classes, including specific classes and imprecise classes. Specific classes are for the data which are distinctly far from the prototypes of other classes, while imprecise classes accept the objects that may be close to the prototypes of more than one class. This soft decision mechanism could make the clustering results more cautious and reduce the misclassification rates. Experiments in synthetic and real data sets are used to illustrate the performance of ECMdd. The results show that ECMdd could capture well the uncertainty in the internal data structure. Moreover, it is more robust to the initializations compared with FCMdd.Comment: in The 18th International Conference on Information Fusion, July 2015, Washington, DC, USA , Jul 2015, Washington, United State

    Observer-biased bearing condition monitoring: from fault detection to multi-fault classification

    Get PDF
    Bearings are simultaneously a fundamental component and one of the principal causes of failure in rotary machinery. The work focuses on the employment of fuzzy clustering for bearing condition monitoring, i.e., fault detection and classification. The output of a clustering algorithm is a data partition (a set of clusters) which is merely a hypothesis on the structure of the data. This hypothesis requires validation by domain experts. In general, clustering algorithms allow a limited usage of domain knowledge on the cluster formation process. In this study, a novel method allowing for interactive clustering in bearing fault diagnosis is proposed. The method resorts to shrinkage to generalize an otherwise unbiased clustering algorithm into a biased one. In this way, the method provides a natural and intuitive way to control the cluster formation process, allowing for the employment of domain knowledge to guiding it. The domain expert can select a desirable level of granularity ranging from fault detection to classification of a variable number of faults and can select a specific region of the feature space for detailed analysis. Moreover, experimental results under realistic conditions show that the adopted algorithm outperforms the corresponding unbiased algorithm (fuzzy c-means) which is being widely used in this type of problems. (C) 2016 Elsevier Ltd. All rights reserved.Grant number: 145602

    Segmentation of articular cartilage and early osteoarthritis based on the fuzzy soft thresholding approach driven by modified evolutionary ABC optimization and local statistical aggregation

    Get PDF
    Articular cartilage assessment, with the aim of the cartilage loss identification, is a crucial task for the clinical practice of orthopedics. Conventional software (SW) instruments allow for just a visualization of the knee structure, without post processing, offering objective cartilage modeling. In this paper, we propose the multiregional segmentation method, having ambitions to bring a mathematical model reflecting the physiological cartilage morphological structure and spots, corresponding with the early cartilage loss, which is poorly recognizable by the naked eye from magnetic resonance imaging (MRI). The proposed segmentation model is composed from two pixel's classification parts. Firstly, the image histogram is decomposed by using a sequence of the triangular fuzzy membership functions, when their localization is driven by the modified artificial bee colony (ABC) optimization algorithm, utilizing a random sequence of considered solutions based on the real cartilage features. In the second part of the segmentation model, the original pixel's membership in a respective segmentation class may be modified by using the local statistical aggregation, taking into account the spatial relationships regarding adjacent pixels. By this way, the image noise and artefacts, which are commonly presented in the MR images, may be identified and eliminated. This fact makes the model robust and sensitive with regards to distorting signals. We analyzed the proposed model on the 2D spatial MR image records. We show different MR clinical cases for the articular cartilage segmentation, with identification of the cartilage loss. In the final part of the analysis, we compared our model performance against the selected conventional methods in application on the MR image records being corrupted by additive image noise.Web of Science117art. no. 86

    Neutrosophic Fuzzy Hierarchical Clustering for Dengue Analysis in Sri Lanka

    Get PDF

    Clustering of nonstationary data streams: a survey of fuzzy partitional methods

    Get PDF
    YesData streams have arisen as a relevant research topic during the past decade. They are real‐time, incremental in nature, temporally ordered, massive, contain outliers, and the objects in a data stream may evolve over time (concept drift). Clustering is often one of the earliest and most important steps in the streaming data analysis workflow. A comprehensive literature is available about stream data clustering; however, less attention is devoted to the fuzzy clustering approach, even though the nonstationary nature of many data streams makes it especially appealing. This survey discusses relevant data stream clustering algorithms focusing mainly on fuzzy methods, including their treatment of outliers and concept drift and shift.Ministero dell‘Istruzione, dell‘Universitá e della Ricerca

    Applying subclustering and Lp distance in Weighted K-Means with distributed centroids

    Get PDF
    We consider the Weighted K-Means algorithm with distributed centroids aimed at clustering data sets with numerical, categorical and mixed types of data. Our approach allows given features (i.e., variables) to have different weights at different clusters. Thus, it supports the intuitive idea that features may have different degrees of relevance at different clusters. We use the Minkowski metric in a way that feature weights become feature re-scaling factors for any considered exponent. Moreover, the traditional Silhouette clustering validity index was adapted to deal with both numerical and categorical types of features. Finally, we show that our new method usually outperforms traditional K-Means as well as the recently proposed WK-DC clustering algorithm.Peer reviewe
    corecore