A correlation-based fuzzy cluster validity index with secondary options detector
The optimal number of clusters is one of the main concerns when applying
cluster analysis. Several cluster validity indexes have been introduced to
address this problem. However, in some situations, there is more than one
option that can be chosen as the final number of clusters. This aspect has been
overlooked by most of the existing works in this area. In this study, we
introduce a correlation-based fuzzy cluster validity index known as the
Wiroonsri-Preedasawakul (WP) index. This index is defined based on the
correlation between the actual distance between a pair of data points and the
distance between adjusted centroids with respect to that pair. We evaluate and
compare the performance of our index with several existing indexes, including
Xie-Beni, Pakhira-Bandyopadhyay-Maulik, Tang, Wu-Li, generalized C, and Kwon2.
We conduct this evaluation on four types of datasets: artificial datasets,
real-world datasets, simulated datasets with ranks, and image datasets, using
the fuzzy c-means algorithm. Overall, the WP index outperforms most, if not
all, of these indexes in terms of accurately detecting the optimal number of
clusters and providing accurate secondary options. Moreover, our index remains
effective even when the fuzziness parameter is set to a large value. The R
package used in this work, WPfuzzyCVIs, is available at
https://github.com/nwiroonsri/WPfuzzyCVIs.
Comment: 19 pages
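The correlation idea described in the abstract above can be illustrated with a rough sketch. This is not the WP index: the abstract does not define the "adjusted centroids", so the code below assumes a simplified stand-in in which each point is represented by the centroid of its highest-membership cluster. The function names `fuzzy_c_means` and `correlation_score` are hypothetical and are not taken from the WPfuzzyCVIs package.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means; returns memberships U (c, n) and centroids V (c, d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                      # each point's memberships sum to 1
    for _ in range(n_iter):
        W = U ** m                          # m is the fuzziness parameter
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        # Distances from every centroid to every point, shape (c, n).
        D = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + 1e-12
        U = D ** (-2.0 / (m - 1.0))
        U /= U.sum(axis=0)
    return U, V

def correlation_score(X, U, V):
    """Pearson correlation between point-to-point distances and the distances
    between the centroids representing each pair (simplifying assumption:
    hard assignment to the highest-membership cluster)."""
    labels = U.argmax(axis=0)
    i, j = np.triu_indices(X.shape[0], k=1)  # all unordered pairs of points
    d_points = np.linalg.norm(X[i] - X[j], axis=1)
    d_centroids = np.linalg.norm(V[labels[i]] - V[labels[j]], axis=1)
    if d_centroids.std() == 0:               # every point in one cluster
        return 0.0
    return float(np.corrcoef(d_points, d_centroids)[0, 1])
```

Under these assumptions, one would run `fuzzy_c_means` for each candidate number of clusters and compare the resulting scores; candidates whose scores are close to the best one could then be read as secondary options.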
A Survey of Adaptive Resonance Theory Neural Network Models for Engineering Applications
This survey samples from the ever-growing family of adaptive resonance theory
(ART) neural network models used to perform the three primary machine learning
modalities, namely, unsupervised, supervised and reinforcement learning. It
comprises a representative list from classic to modern ART models, thereby
painting a general picture of the architectures developed by researchers over
the past 30 years. The learning dynamics of these ART models are briefly
described, and their distinctive characteristics such as code representation,
long-term memory and corresponding geometric interpretation are discussed.
Useful engineering properties of ART (speed, configurability, explainability,
parallelization and hardware implementation) are examined along with current
challenges. Finally, a compilation of online software libraries is provided. It
is expected that this overview will be helpful to new and seasoned ART
researchers.
Evidential relational clustering using medoids
In real clustering applications, proximity data, in which only pairwise
similarities or dissimilarities are known, is more general than object data, in
which each pattern is described explicitly by a list of attributes.
Medoid-based clustering algorithms, which assume the prototypes of classes are
objects, are of great value for partitioning relational data sets. This
paper proposes a new prototype-based clustering method, Evidential C-Medoids
(ECMdd), which extends Fuzzy C-Medoids (FCMdd) within the theoretical
framework of belief functions. In ECMdd, medoids are utilized as
the prototypes to represent the detected classes, including specific classes
and imprecise classes. Specific classes are for the data which are distinctly
far from the prototypes of other classes, while imprecise classes accept the
objects that may be close to the prototypes of more than one class. This soft
decision mechanism could make the clustering results more cautious and reduce
the misclassification rates. Experiments on synthetic and real data sets are
used to illustrate the performance of ECMdd. The results show that ECMdd can
capture the uncertainty in the internal data structure well. Moreover, it is
more robust to initialization than FCMdd.
Comment: In The 18th International Conference on Information Fusion, July 2015, Washington, DC, USA
Observer-biased bearing condition monitoring: from fault detection to multi-fault classification
Bearings are simultaneously a fundamental component and one of the principal causes of failure in rotary machinery. This work focuses on the use of fuzzy clustering for bearing condition monitoring, i.e., fault detection and classification. The output of a clustering algorithm is a data partition (a set of clusters), which is merely a hypothesis about the structure of the data. This hypothesis requires validation by domain experts. In general, clustering algorithms allow only limited use of domain knowledge in the cluster formation process. In this study, a novel method allowing for interactive clustering in bearing fault diagnosis is proposed. The method resorts to shrinkage to generalize an otherwise unbiased clustering algorithm into a biased one. In this way, the method provides a natural and intuitive way to control the cluster formation process, allowing domain knowledge to guide it. The domain expert can select a desired level of granularity, ranging from fault detection to classification of a variable number of faults, and can select a specific region of the feature space for detailed analysis. Moreover, experimental results under realistic conditions show that the adopted algorithm outperforms the corresponding unbiased algorithm (fuzzy c-means), which is widely used for this type of problem.
(C) 2016 Elsevier Ltd. All rights reserved.
Grant number: 145602
Segmentation of articular cartilage and early osteoarthritis based on the fuzzy soft thresholding approach driven by modified evolutionary ABC optimization and local statistical aggregation
Articular cartilage assessment, with the aim of identifying cartilage loss, is a crucial task in the clinical practice of orthopedics. Conventional software (SW) instruments only visualize the knee structure, without post-processing that would offer objective cartilage modeling. In this paper, we propose a multiregional segmentation method that aims to provide a mathematical model reflecting the physiological morphological structure of the cartilage and the spots corresponding to early cartilage loss, which are poorly recognizable by the naked eye in magnetic resonance imaging (MRI). The proposed segmentation model consists of two pixel classification parts. In the first part, the image histogram is decomposed using a sequence of triangular fuzzy membership functions whose localization is driven by a modified artificial bee colony (ABC) optimization algorithm, utilizing a random sequence of considered solutions based on real cartilage features. In the second part, a pixel's original membership in its segmentation class may be modified by local statistical aggregation, which takes into account the spatial relationships among adjacent pixels. In this way, the image noise and artefacts that are commonly present in MR images may be identified and eliminated. This makes the model robust and sensitive with regard to distorting signals. We analyzed the proposed model on 2D spatial MR image records. We show different MR clinical cases of articular cartilage segmentation with identification of the cartilage loss. In the final part of the analysis, we compared the performance of our model against selected conventional methods on MR image records corrupted by additive image noise.
Web of Science 117, art. no. 86
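The two-part classification described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class peaks are supplied by hand rather than located by the modified ABC optimization, and a plain 3x3 mean filter stands in for the paper's local statistical aggregation, whose exact form the abstract does not give. All function names are hypothetical.

```python
import numpy as np

def triangular_membership(x, left, peak, right):
    """Triangular fuzzy membership: 0 at `left` and `right`, 1 at `peak`."""
    x = np.asarray(x, dtype=float)
    rising = (x - left) / (peak - left)
    falling = (right - x) / (right - peak)
    return np.clip(np.minimum(rising, falling), 0.0, 1.0)

def local_mean(a):
    """3x3 neighbourhood mean with edge replication (assumed aggregation)."""
    p = np.pad(a, 1, mode="edge")
    h, w = a.shape
    return sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)) / 9.0

def segment(image, peaks, aggregate=True):
    """Label each pixel with the class of highest (optionally smoothed) membership.

    `peaks` are increasing intensity values, one per class (at least two);
    each triangle spans from the previous peak to the next one.
    """
    peaks = np.asarray(peaks, dtype=float)
    # Extend the peak list by one virtual peak on each side for the end classes.
    ext = np.concatenate(([2 * peaks[0] - peaks[1]], peaks, [2 * peaks[-1] - peaks[-2]]))
    # Intensities outside the peak range are treated as belonging to the end classes.
    x = np.clip(np.asarray(image, dtype=float), peaks[0], peaks[-1])
    memberships = [triangular_membership(x, ext[k], ext[k + 1], ext[k + 2])
                   for k in range(len(peaks))]
    if aggregate:
        memberships = [local_mean(m) for m in memberships]
    return np.argmax(np.stack(memberships), axis=0)
```

With `aggregate=False` an isolated bright pixel keeps its own class, while with `aggregate=True` its membership is averaged with its neighbours and the pixel tends to be absorbed into the surrounding class, which is the noise-suppression effect the abstract attributes to the second part.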
Clustering of nonstationary data streams: a survey of fuzzy partitional methods
Data streams have arisen as a relevant research topic during the past decade. They are real-time, incremental in nature, temporally ordered, and massive; they contain outliers, and the objects in a data stream may evolve over time (concept drift). Clustering is often one of the earliest and most important steps in the streaming data analysis workflow. A comprehensive literature is available on stream data clustering; however, less attention has been devoted to the fuzzy clustering approach, even though the nonstationary nature of many data streams makes it especially appealing. This survey discusses relevant data stream clustering algorithms, focusing mainly on fuzzy methods, including their treatment of outliers and of concept drift and shift.
Ministero dell'Istruzione, dell'Università e della Ricerca
Applying subclustering and Lp distance in Weighted K-Means with distributed centroids
We consider the Weighted K-Means algorithm with distributed centroids, aimed at clustering data sets with numerical, categorical and mixed types of data. Our approach allows a given feature (i.e., variable) to have different weights in different clusters. Thus, it supports the intuitive idea that features may have different degrees of relevance in different clusters. We use the Minkowski metric in such a way that feature weights become feature re-scaling factors for any considered exponent. Moreover, the traditional Silhouette clustering validity index was adapted to deal with both numerical and categorical types of features. Finally, we show that our new method usually outperforms traditional K-Means as well as the recently proposed WK-DC clustering algorithm.
Peer reviewed
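The remark that feature weights act as re-scaling factors under the Minkowski metric can be illustrated with a small sketch. It assumes numerical features only (the distributed centroids used for categorical features are not modeled) and a distance of the form sum over features of (w * |x - c|)^p, left without the final root, as is common in Minkowski-weighted K-Means formulations. The function names are hypothetical and the abstract does not confirm this exact formula.

```python
import numpy as np

def weighted_minkowski(x, centroid, weights, p=2.0):
    """Cluster-specific weighted Minkowski distance raised to the power p.

    Because the weight sits inside the power, it simply rescales each
    feature before an ordinary Minkowski distance is taken.
    """
    x, centroid, weights = (np.asarray(a, dtype=float) for a in (x, centroid, weights))
    return float(np.sum((weights * np.abs(x - centroid)) ** p))

def assign(X, centroids, weights, p=2.0):
    """Assign each row of X to the nearest centroid.

    `centroids` and `weights` both have shape (k, n_features), so every
    cluster carries its own feature weights.
    """
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    weights = np.asarray(weights, dtype=float)
    diffs = np.abs(X[:, None, :] - centroids[None, :, :])   # (n, k, features)
    dists = np.sum((weights[None, :, :] * diffs) ** p, axis=2)
    return dists.argmin(axis=1)
```

In this sketch, setting a feature's weight to zero for one cluster makes that cluster ignore the feature entirely, which can change a point's assignment compared with uniform weights.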