
    Dealing with non-metric dissimilarities in fuzzy central clustering algorithms

    Clustering is the problem of grouping objects on the basis of a similarity measure among them. Relational clustering methods can be employed when a feature-based representation of the objects is not available and their description is given in terms of pairwise (dis)similarities. This paper focuses on the relational duals of fuzzy central clustering algorithms and their application in situations where patterns are represented by means of non-metric pairwise dissimilarities. Symmetrization and shift operations have been proposed to transform the dissimilarities among patterns from non-metric to metric. In this paper, we analyze how four popular fuzzy central clustering algorithms are affected by such transformations. The main contributions include showing the lack of invariance to shift operations, as well as the invariance to symmetrization. Moreover, we highlight the connections between relational duals of central clustering algorithms and central clustering algorithms in kernel-induced spaces. One of the presented algorithms has never been proposed for non-metric relational clustering and turns out to be very robust to shift operations.
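The two transformations analyzed lend themselves to a compact illustration. Below is a minimal NumPy sketch of symmetrization and of the off-diagonal shift operation, assuming a square dissimilarity matrix with zero diagonal; the function names and the default choice of shift constant are illustrative, not taken from the paper.

```python
import numpy as np

def symmetrize(D):
    """Replace D by its symmetric part (D + D.T) / 2."""
    return (D + D.T) / 2.0

def shift(D, beta=None):
    """Add a constant beta to every off-diagonal entry of a symmetric
    dissimilarity matrix D.  A sufficiently large beta makes D
    squared-Euclidean; the default below (twice the most negative
    eigenvalue of the centered matrix) is one common sufficient choice.
    """
    n = D.shape[0]
    if beta is None:
        J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
        B = -0.5 * J @ D @ J                     # pseudo-Gram matrix
        beta = 2.0 * max(0.0, -np.linalg.eigvalsh(B).min())
    return D + beta * (1.0 - np.eye(n))
```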

    Certainty of outlier and boundary points processing in data mining

    Data certainty is one of the issues in real-world applications, caused by unwanted noise in data. Recently, more attention has been paid to overcoming this problem. We propose a new method based on neutrosophic set (NS) theory to detect boundary and outlier points, the challenging points for clustering methods. First, a certainty value is assigned to data points based on the proposed definition in NS. Then, a certainty set is presented for the proposed cost function in the NS domain by considering a set of main clusters and a noise cluster. After that, the proposed cost function is minimized by gradient descent, and data points are clustered based on their membership degrees. Outlier points are assigned to the noise cluster, and boundary points are assigned to main clusters with almost equal membership degrees. To show the effectiveness of the proposed method, two types of datasets are used: three scatter-type datasets and four UCI datasets. Results demonstrate that the proposed cost function handles boundary and outlier points with more accurate membership degrees and outperforms existing state-of-the-art clustering methods.
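The paper's NS cost function is not reproduced here, but the general mechanism it builds on — an extra noise cluster that absorbs outliers while boundary points split their membership among main clusters — can be sketched. The following is a generic Dave-style noise-clustering membership computation, not the authors' neutrosophic formulation; all names and the parameter delta are illustrative.

```python
import numpy as np

def noise_memberships(X, centers, delta, m=2.0):
    """Fuzzy memberships with one extra 'noise cluster' kept at a fixed
    distance delta from every point.  Outliers, far from all real
    prototypes, put most of their membership in the noise cluster;
    boundary points split theirs almost evenly among nearby clusters.
    """
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    d2 = np.hstack([d2, np.full((len(X), 1), delta ** 2)])
    d2 = np.maximum(d2, 1e-12)                   # avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # rows sum to 1
```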

    Advances in transfer learning methods based on computational intelligence

    Traditional machine learning and data mining have made tremendous progress in many knowledge-based areas, such as clustering, classification, and regression. However, the primary assumption in all of these areas is that the training and testing data are in the same domain and have the same distribution. This assumption is difficult to satisfy in real-world applications due to the limited availability of labeled data. Associated data in different domains can be used to expand the availability of prior knowledge about future target data. In recent years, transfer learning has been used to address such cross-domain learning problems by using information from a related domain and transferring it to the target task. The transfer learning methodology is utilized in this work with both unsupervised and supervised learning methods. For unsupervised learning, a novel transfer-learning possibilistic c-means (TLPCM) algorithm is proposed to handle the PCM clustering problem in a domain that has insufficient data. Moreover, TLPCM overcomes the problem of differing numbers of clusters between the source and target domains. The proposed algorithm employs the historical cluster centers of the source data as a reference to guide the clustering of the target data. The experimental studies presented here demonstrate the advantages of TLPCM on both synthetic and real-world transfer datasets. For supervised learning, a transfer learning (TL) technique is used to pre-train a CNN model on posture data and then fine-tune it on sleep stage data. We used a ballistocardiography (BCG) bed sensor to collect both posture and sleep stage data to provide a non-invasive, in-home monitoring system that tracks changes in the subjects' health over time. Since the quality of sleep has a significant impact on health and life, this study adopts hierarchical and non-hierarchical classification structures to develop an automatic sleep stage classification system using ballistocardiogram (BCG) signals. A leave-one-subject-out cross-validation (LOSO-CV) procedure is used for testing classification performance in most of the experiments. Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Deep Neural Networks (DNNs) are complementary in their modeling capabilities: CNNs have the advantage of reducing frequency variations, while LSTMs are good at temporal modeling. Polysomnography (PSG) data from a sleep lab was used as the ground truth for sleep stages, with the emphasis on three stages, specifically, awake, rapid eye movement (REM), and non-REM sleep (NREM). Moreover, a transfer learning approach is employed with supervised learning to address the cross-resident training problem in early illness prediction. We validate our method by conducting a retrospective study on three residents from TigerPlace, a retirement community in Columbia, MO, where apartments are fitted with wireless networks of motion and bed sensors. Predicting the early signs of illness in older adults by using a continuous, unobtrusive nursing home monitoring system has been shown to increase quality of life and decrease care costs. Illness prediction is based on sensor data and uses algorithms such as support vector machine (SVM) and k-nearest neighbors (kNN). One of the most significant challenges related to the development of prediction algorithms for sensor networks is the use of knowledge from previous residents to predict the behaviors of new ones.
    Each day, the presence or absence of illness was manually evaluated using nursing visit reports from a homegrown electronic medical record (EMR) system. In this work, the transfer learning SVM approach outperformed three other methods, i.e., regular SVM, one-class SVM, and one-class kNN.
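A minimal PyTorch-style sketch of the pre-train/fine-tune pattern described above: freeze the convolutional feature extractor learned on posture data and retrain a new head on the three sleep stages. The architecture, layer sizes, and class counts are hypothetical placeholders, not the network used in the study.

```python
import torch.nn as nn

class SmallCNN(nn.Module):
    """Toy 1-D CNN for sensor sequences (architecture hypothetical)."""
    def __init__(self, n_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8), nn.Flatten())
        self.head = nn.Linear(16 * 8, n_classes)

    def forward(self, x):          # x: (batch, 1, sequence_length)
        return self.head(self.features(x))

# 1) Pre-train on the larger posture dataset (training loop omitted).
model = SmallCNN(n_classes=4)      # e.g. four posture classes
# 2) Transfer: freeze the learned features, swap in a new head, and
#    fine-tune only the head on the smaller sleep-stage dataset.
for p in model.features.parameters():
    p.requires_grad = False
model.head = nn.Linear(16 * 8, 3)  # awake / REM / NREM
```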

    Assimilation of Standard Regularizer Contextual Model and Composite Kernel with Fuzzy-based Noise Classifier

    This paper examines the effect of assimilating a smoothness-prior contextual model and a composite kernel function with a fuzzy-based noise classifier using remote sensing data. The composite kernel is formed by fusing two kernels together to improve classification accuracy; Gaussian and sigmoid kernel functions have been chosen for the kernel composition. As the contextual model, the Markov Random Field (MRF) standard regularization model (smoothness prior) has been studied with the composite-kernel-based noise classifier. Comparative analysis of the new classifier with the conventional one shows an increase in overall accuracy.
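A composite kernel of the kind described — a weighted fusion of Gaussian and sigmoid kernels — can be sketched in a few lines of NumPy. The convex-combination weighting and all parameter defaults below are assumptions for illustration; the paper's exact fusion rule may differ.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def sigmoid_kernel(X, Y, alpha=0.01, c=0.0):
    return np.tanh(alpha * X @ Y.T + c)

def composite_kernel(X, Y, lam=0.5):
    """Weighted fusion of the two base kernels; lam weights the
    Gaussian part (the weighting scheme is an assumption)."""
    return lam * gaussian_kernel(X, Y) + (1.0 - lam) * sigmoid_kernel(X, Y)
```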

    Residual-Sparse Fuzzy C-Means Clustering Incorporating Morphological Reconstruction and Wavelet Frames

    Instead of directly utilizing an observed image containing outliers, noise, or intensity inhomogeneity, the use of its ideal value (e.g., the noise-free image) has a favorable impact on clustering. Hence, the accurate estimation of the residual (e.g., unknown noise) between the observed image and its ideal value is an important task. To do so, we propose an ℓ0 regularization-based Fuzzy C-Means (FCM) algorithm incorporating a morphological reconstruction operation and a tight wavelet frame transform. To achieve a sound trade-off between detail preservation and noise suppression, morphological reconstruction is used to filter the observed image. By combining the observed and filtered images, a weighted sum image is generated. Since a tight wavelet frame system provides sparse representations of an image, it is employed to decompose the weighted sum image, thus forming its corresponding feature set. Taking this as the data for clustering, we present an improved FCM algorithm that imposes an ℓ0 regularization term on the residual between the feature set and its ideal value, which implies that a favorable estimate of the residual is obtained and the ideal value participates in clustering. Spatial information is also introduced into clustering, since it naturally arises in image segmentation, and it makes the estimation of the residual more reliable. To further enhance the segmentation effects of the improved FCM algorithm, we also employ morphological reconstruction to smooth the labels generated by clustering. Finally, based on the prototypes and smoothed labels, the segmented image is reconstructed by using a tight wavelet frame reconstruction operation. Experimental results reported for synthetic, medical, and color images show that the proposed algorithm is effective and efficient, and outperforms other algorithms.
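The filtering-and-weighting front end of the pipeline can be sketched with scikit-image. Below, opening-by-reconstruction filters the observed image and a weighted sum blends it back with the original; the structuring-element radius and weight alpha are assumed values, and the ℓ0-regularized clustering step itself is not shown.

```python
from skimage.morphology import disk, erosion, reconstruction

def filtered_weighted_sum(img, radius=2, alpha=0.5):
    """Opening-by-reconstruction of a grayscale image, then a weighted
    sum of the observed and filtered images.  radius and alpha are
    illustrative; the paper's exact weighting may differ.
    """
    seed = erosion(img, disk(radius))            # marker <= mask holds
    filtered = reconstruction(seed, img, method='dilation')
    return alpha * img + (1.0 - alpha) * filtered
```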

    Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches

    Imaging spectrometers measure electromagnetic energy scattered in their instantaneous field of view in hundreds or thousands of spectral channels, with higher spectral resolution than multispectral cameras. Imaging spectrometers are therefore often referred to as hyperspectral cameras (HSCs). Higher spectral resolution enables material identification via spectroscopic analysis, which facilitates countless applications that require identifying materials in scenarios unsuitable for classical spectroscopic analysis. Due to the low spatial resolution of HSCs, microscopic material mixing, and multiple scattering, spectra measured by HSCs are mixtures of the spectra of the materials in a scene. Thus, accurate estimation requires unmixing. Pixels are assumed to be mixtures of a few materials, called endmembers. Unmixing involves estimating all or some of: the number of endmembers, their spectral signatures, and their abundances at each pixel. Unmixing is a challenging, ill-posed inverse problem because of model inaccuracies, observation noise, environmental conditions, endmember variability, and data set size. Researchers have devised and investigated many models searching for robust, stable, tractable, and accurate unmixing algorithms. This paper presents an overview of unmixing methods from the time of Keshava and Mustard's unmixing tutorial [1] to the present. Mixing models are first discussed. Signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms are described. Mathematical problems and potential solutions are described. Algorithm characteristics are illustrated experimentally.
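As a concrete anchor for the linear mixing model the overview discusses, here is a minimal abundance-estimation sketch using nonnegative least squares per pixel. The sum-to-one constraint and every refinement surveyed in the paper are omitted; this is only the simplest baseline.

```python
import numpy as np
from scipy.optimize import nnls

def unmix_nnls(Y, M):
    """Per-pixel abundances under the linear mixing model Y ~ M A,
    with nonnegativity enforced; the sum-to-one constraint is omitted
    for brevity.  Y: (bands, pixels), M: (bands, endmembers).
    """
    A = np.zeros((M.shape[1], Y.shape[1]))
    for j in range(Y.shape[1]):
        A[:, j], _ = nnls(M, Y[:, j])
    return A
```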

    A hybrid interval type-2 semi-supervised possibilistic fuzzy c-means clustering and particle swarm optimization for satellite image analysis

    Although satellite images can provide more information about the earth’s surface in a relatively short time and over a large scale, they are affected by observation conditions and the accuracy of the image acquisition equipment. The objects in the images are often unclear and uncertain, especially at their borders. The type-1 fuzzy set-based clustering technique allows each data pattern to belong to many different clusters through membership function (MF) values, and so handles data patterns with unclear and uncertain boundaries well. However, this technique is quite sensitive to noise and outliers and is limited in handling uncertainty. To overcome these disadvantages, we propose a hybrid method combining interval type-2 semi-supervised possibilistic fuzzy c-means clustering (IT2SPFCM) and Particle Swarm Optimization (PSO) to form the proposed IT2SPFCM-PSO. We experimented on several satellite images to prove the effectiveness of the proposed method. Experimental results show that the IT2SPFCM-PSO algorithm achieves accuracies from 98.8% to 99.39%, higher than those of competing algorithms, including SFCM, SMKFCM, SIIT2FCM, PFCM, SPFCM-W, SPFCM-SS, and IT2SPFCM. Analysis of the results with the indicators PC-I, CE-I, D-I, XB-I, t-I, and MSE also shows that the proposed method gives better results in most experiments.
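The interval type-2 ingredient of IT2SPFCM can be sketched via its standard mechanism: running the FCM membership formula with two fuzzifiers m1 < m2 and taking element-wise bounds. This illustrates the general IT2 idea only, not the semi-supervised possibilistic terms or the PSO tuning of the proposed method.

```python
import numpy as np

def fcm_memberships(d2, m):
    inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def it2_membership_interval(X, centers, m1=1.5, m2=3.0):
    """Upper/lower membership bounds from two fuzzifiers m1 < m2, the
    usual interval type-2 FCM device for modeling uncertainty in m.
    """
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    u1, u2 = fcm_memberships(d2, m1), fcm_memberships(d2, m2)
    return np.maximum(u1, u2), np.minimum(u1, u2)  # upper, lower
```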

    Informational Paradigm, management of uncertainty and theoretical formalisms in the clustering framework: A review

    Fifty years have gone by since the publication of the first paper on clustering based on fuzzy sets theory. In 1965, L.A. Zadeh published “Fuzzy Sets” [335]. After only one year, the first effects of this seminal paper began to emerge with the pioneering paper on clustering by Bellman, Kalaba, and Zadeh [33], in which they proposed a prototype of a clustering algorithm based on fuzzy sets theory.

    A Novel Fuzzy c-Means Clustering Algorithm Using Adaptive Norm

    The fuzzy c-means (FCM) clustering algorithm is an unsupervised learning method that has been widely applied to cluster unlabeled data automatically instead of manually, but it is sensitive to noisy observations due to its inappropriate treatment of noise in the data. In this paper, novel methods that treat noise intelligently based on the existing FCM approach, called adaptive-FCM and its extended version (adaptive-REFCM) in combination with relative entropy, are proposed. Adaptive-FCM, relying on an inventive integration of the adaptive norm, benefits from a robust overall structure. Adaptive-REFCM further integrates the properties of relative entropy and normalized distance to preserve the global details of the dataset. Several experiments are carried out...
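For reference, the baseline FCM alternating updates that adaptive-FCM and adaptive-REFCM build on can be written compactly; the adaptive norm and relative-entropy terms themselves are not shown. A minimal NumPy sketch with illustrative defaults:

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, seed=0):
    """Plain FCM: alternate prototype and membership updates until the
    memberships stop changing.  Illustrative defaults throughout."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))       # initial memberships
    for _ in range(iters):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]       # prototype update
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True) # membership update
        if np.abs(U_new - U).max() < 1e-5:
            break
        U = U_new
    return U, V
```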