257 research outputs found

    A survey of kernel and spectral methods for clustering

    Clustering algorithms are a useful tool for exploring data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem, with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof that these two paradigms have the same objective is reported, since it has been proven that these two seemingly different approaches have the same mathematical foundation. Besides, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm. © 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
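As a rough illustration of the kernel approach described in this abstract, the following is a minimal sketch of kernel K-means in plain Python. It is not taken from the surveyed paper: the RBF kernel, the `gamma` value, the function names, and the toy data are all assumptions made for the example. The key step is that the squared distance from a point to a cluster mean in feature space can be computed from kernel values alone, without ever forming the feature map.

```python
import math

def rbf_kernel(x, y, gamma=0.1):
    """Gaussian (RBF) kernel; gamma is an assumed, hand-picked width."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_kmeans(points, init_labels, k, gamma=0.1, max_iter=50):
    """Reassign points to the nearest feature-space mean until labels stop changing."""
    n = len(points)
    # Gram matrix: K[i][j] = <phi(x_i), phi(x_j)>
    K = [[rbf_kernel(p, q, gamma) for q in points] for p in points]
    labels = list(init_labels)
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # ||mean_c||^2 = (1/|C|^2) * sum of K over all pairs in the cluster
        within = [
            sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
            for m in clusters
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], math.inf
            for c, m in enumerate(clusters):
                if not m:
                    continue  # skip empty clusters
                cross = sum(K[i][j] for j in m) / len(m)
                # ||phi(x_i) - mean_c||^2 via the kernel trick
                d = K[i][i] - 2.0 * cross + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Toy data (assumed): two blobs, with one point initially mislabeled.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = kernel_kmeans(points, [0, 0, 1, 1, 1, 1], k=2)
```

On this toy input the mislabeled point `(1, 0)` moves back to the first blob. Like ordinary K-means, the procedure depends on its initial labeling, so a real application would typically try several initializations.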

    Image Segmentation Using Ant System-based Clustering Algorithm

    Industrial applications of computer vision sometimes require detection of atypical objects that occur as small groups of pixels in digital images. These objects are difficult to single out because they are small and randomly distributed. In this work we propose an image segmentation method using the novel Ant System-based Clustering Algorithm (ASCA). ASCA models the foraging behaviour of ants, which move through the data space searching for high data-density regions and leave pheromone trails on their path. The pheromone map is used to identify the exact number of clusters and to assign the pixels to these clusters using the pheromone gradient. We applied ASCA to the detection of microcalcifications in digital mammograms and compared its performance with state-of-the-art clustering algorithms such as 1D Self-Organizing Map, k-Means, Fuzzy c-Means and Possibilistic Fuzzy c-Means. The main advantage of ASCA is that the number of clusters need not be known a priori. The experimental results show that ASCA is more efficient than the other algorithms in detecting small clusters of atypical data.

    General fuzzy min-max neural network for clustering and classification

    This paper describes a general fuzzy min-max (GFMM) neural network which is a generalization and extension of the fuzzy min-max clustering and classification algorithms of Simpson (1992, 1993). The GFMM method combines supervised and unsupervised learning in a single training algorithm. The fusion of clustering and classification results in an algorithm that can be used for pure clustering, pure classification, or hybrid clustering-classification. It can find decision boundaries between classes while clustering patterns that cannot be said to belong to any of the existing classes. As in the original algorithms, hyperbox fuzzy sets are used as a representation of clusters and classes. Learning is usually completed in a few passes and consists of placing and adjusting the hyperboxes in the pattern space; this is an expansion-contraction process. The classification results can be crisp or fuzzy. New data can be included without the need for retraining. While retaining all the interesting features of the original algorithms, a number of modifications to their definition have been made in order to accommodate fuzzy input patterns in the form of lower and upper bounds, combine supervised and unsupervised learning, and improve the effectiveness of operations. A detailed account of the GFMM neural network, a comparison with Simpson's fuzzy min-max neural networks, a set of examples, and an application to leakage detection and identification in water distribution systems are given.
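To make the idea of hyperbox fuzzy sets with interval inputs more concrete, here is a small Python sketch of a ramp-based membership function. It is an illustrative formulation, not a transcription of the paper's definition: the function names, the sensitivity parameter `gamma`, and the example values are assumptions, and the exact form should be checked against the paper itself. A hyperbox is described by its min point `v` and max point `w`; an input pattern is given by lower and upper bounds per dimension.

```python
def ramp(r, gamma):
    """Ramp threshold: 0 when r*gamma < 0, r*gamma on [0, 1], and 1 above."""
    return min(1.0, max(0.0, r * gamma))

def hyperbox_membership(lower, upper, v, w, gamma=1.0):
    """Membership of the interval pattern [lower, upper] in the hyperbox [v, w].

    Returns 1.0 when the pattern lies fully inside the box and decays
    linearly (at a rate set by gamma) as it moves outside.
    """
    degrees = []
    for lo, up, v_i, w_i in zip(lower, upper, v, w):
        above = 1.0 - ramp(up - w_i, gamma)  # penalty for exceeding the max point
        below = 1.0 - ramp(v_i - lo, gamma)  # penalty for falling below the min point
        degrees.append(min(above, below))
    # The worst dimension determines the overall membership.
    return min(degrees)

# Assumed example values: one pattern inside the box, one crisp point outside it.
inside = hyperbox_membership([0.2, 0.3], [0.4, 0.5], v=[0.1, 0.1], w=[0.6, 0.6])
outside = hyperbox_membership([0.8, 0.3], [0.8, 0.3], v=[0.1, 0.1], w=[0.6, 0.6], gamma=2.0)
```

A crisp input is simply the special case where the lower and upper bounds coincide, which is why a single function can serve both kinds of pattern.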

    Subspace Clustering: A Possibilistic Approach

    Subspace clustering is the problem of modeling a collection of data points lying in one or more subspaces in the presence of noise, outliers and missing data. To the best of our knowledge, all the algorithms associated with this problem follow a hard clustering philosophy, assuming that each data point belongs strictly to one subspace. The study presented in this thesis explores the effectiveness of the possibilistic approach, in which a point may belong to more than one subspace simultaneously and independently, giving rise to a novel iterative algorithm called sparse adaptive possibilistic K-subspaces (SAP K-subspaces). The SAP K-subspaces algorithm generalizes the sparse possibilistic c-means algorithm (SPCM) [2]. Hence, it inherits the ability to reliably handle data corrupted by noise and containing outliers, as well as data points near the intersections of subspaces. In addition, the new algorithm is initialized with more clusters than actually exist in the data set and is able to gradually eliminate the unnecessary ones, concluding with the true clusters formed by the data. Moreover, it adopts the low-rank approach introduced in [1] to estimate the dimension of the involved subspaces. Experiments on both synthetic and real data illustrate the effectiveness of the proposed method. [1] Paris V Giampouras, Athanasios A Rontogiannis, and Konstantinos D Koutroumbas. Alternating iteratively reweighted least squares minimization for low-rank matrix factorization. IEEE Transactions on Signal Processing, 67(2):490–503, 2018. [2] Spyridoula D Xenaki, Konstantinos D Koutroumbas, and Athanasios A Rontogiannis. Sparsity-aware possibilistic clustering algorithms. IEEE Transactions on Fuzzy Systems, 24(6):1611–1626, 2016.
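The difference between possibilistic and fuzzy memberships mentioned in this abstract can be sketched in a few lines of Python. This shows the classical possibilistic c-means typicality and the fuzzy c-means membership only; it is not the SPCM or SAP K-subspaces update from the thesis. The function names, the scale parameter `eta`, the fuzzifier `m`, and the example distances are assumptions for illustration.

```python
def pcm_typicality(sq_dist, eta, m=2.0):
    """Classical possibilistic c-means typicality of a point for one cluster.

    sq_dist is the squared distance to the cluster representative,
    eta is the cluster's scale parameter, m > 1 is the fuzzifier.
    """
    return 1.0 / (1.0 + (sq_dist / eta) ** (1.0 / (m - 1.0)))

def fcm_memberships(sq_dists, m=2.0):
    """Fuzzy c-means memberships of one point; constrained to sum to 1.

    Assumes all squared distances are strictly positive.
    """
    inv = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in sq_dists]
    total = sum(inv)
    return [value / total for value in inv]

# Assumed example: an outlier equally far (squared distance 100) from two clusters.
outlier_sq_dists = [100.0, 100.0]
fuzzy = fcm_memberships(outlier_sq_dists)
possibilistic = [pcm_typicality(d, eta=1.0) for d in outlier_sq_dists]
```

Because the fuzzy memberships must sum to one, the outlier is forced to belong half to each cluster, whereas its possibilistic typicalities are computed per cluster and are both close to zero. That independence is also what lets a point near an intersection score highly for several clusters at once.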

    Advances in transfer learning methods based on computational intelligence

    Traditional machine learning and data mining have made tremendous progress in many knowledge-based areas, such as clustering, classification, and regression. However, the primary assumption in all of these areas is that the training and testing data should be in the same domain and have the same distribution. This assumption is difficult to achieve in real-world applications due to the limited availability of labeled data. Associated data in different domains can be used to expand the availability of prior knowledge about future target data. In recent years, transfer learning has been used to address such cross-domain learning problems by using information from data in a related domain and transferring that data to the target task. The transfer learning methodology is utilized in this work with unsupervised and supervised learning methods. For unsupervised learning, a novel transfer-learning possibilistic c-means (TLPCM) algorithm is proposed to handle the PCM clustering problem in a domain that has insufficient data. Moreover, TLPCM overcomes the problem of differing numbers of clusters between the source and target domains. The proposed algorithm employs the historical cluster centers of the source data as a reference to guide the clustering of the target data. The experimental studies presented here were thoroughly evaluated, and they demonstrate the advantages of TLPCM in both synthetic and real-world transfer datasets. For supervised learning, a transfer learning (TL) technique is used to pre-train a CNN model on posture data and then fine-tune it on the sleep stage data. We used a ballistocardiography (BCG) bed sensor to collect both posture and sleep stage data to provide a non-invasive, in-home monitoring system that tracks changes in the subjects' health over time. The quality of sleep has a significant impact on health and life. 
This study adopts hierarchical and non-hierarchical classification structures to develop an automatic sleep stage classification system using ballistocardiogram (BCG) signals. A leave-one-subject-out cross-validation (LOSO-CV) procedure is used for testing classification performance in most of the experiments. Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Deep Neural Networks (DNNs) are complementary in their modeling capabilities: CNNs have the advantage of reducing frequency variations, while LSTMs are good at temporal modeling. Polysomnography (PSG) data from a sleep lab was used as the ground truth for sleep stages, with the emphasis on three sleep stages, specifically awake, rapid eye movement (REM), and non-REM (NREM) sleep. Moreover, a transfer learning approach is employed with supervised learning to address the cross-resident training problem in predicting early illness. We validate our method by conducting a retrospective study on three residents from TigerPlace, a retirement community in Columbia, MO, where apartments are fitted with wireless networks of motion and bed sensors. Predicting the early signs of illness in older adults by using a continuous, unobtrusive nursing home monitoring system has been shown to increase the quality of life and decrease care costs. Illness prediction is based on sensor data and uses algorithms such as support vector machine (SVM) and k-nearest neighbors (kNN). One of the most significant challenges in developing prediction algorithms for sensor networks is using knowledge from previous residents to predict new residents' behaviors. Each day, the presence or absence of illness was manually evaluated using nursing visit reports from a homegrown electronic medical record (EMR) system. In this work, the transfer learning SVM approach outperformed three other methods, i.e., regular SVM, one-class SVM, and one-class kNN.
    Includes bibliographical references (pages 114-127).
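The leave-one-subject-out procedure named in this abstract can be sketched as follows. This is a generic illustration in plain Python, not the evaluation code used in the work: the function name `loso_splits` and the subject labels are hypothetical. Each subject in turn is held out for testing, and all samples from the remaining subjects form the training set, so no subject contributes to both sides of a split.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one tuple per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical subject labels: one entry per recorded sample.
subject_ids = ["A", "A", "B", "C", "C", "C"]
splits = list(loso_splits(subject_ids))
```

Splitting by subject rather than by sample keeps recordings from the same person out of both the training and test sets, which would otherwise tend to inflate the measured accuracy.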