
    Possibilistic and fuzzy clustering methods for robust analysis of non-precise data

    This work focuses on robust clustering of data affected by imprecision. The imprecision is managed in terms of fuzzy sets. The clustering process is based on the fuzzy and possibilistic approaches. In both approaches the observations are assigned to the clusters by means of membership degrees. In fuzzy clustering the membership degrees express how the observations are shared among the clusters. In contrast, in possibilistic clustering the membership degrees are degrees of typicality. These two sources of information are complementary because the former helps to discover the best fuzzy partition of the observations while the latter reflects how well the observations are described by the centroids and is therefore helpful for identifying outliers. First, a fully possibilistic k-means clustering procedure is suggested. Then, in order to exploit the benefits of both approaches, a joint possibilistic and fuzzy clustering method for fuzzy data is proposed. A selection procedure for choosing the parameters of the new clustering method is introduced. The effectiveness of the proposal is investigated by means of simulated and real-life data.
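
The contrast between degrees of sharing and degrees of typicality can be illustrated with a minimal sketch. It assumes crisp (non-fuzzy) data, Euclidean distances, the standard fuzzy c-means membership formula, and the widely used possibilistic typicality formula with a per-cluster scale `eta`. None of these choices is taken from the abstract, and the function names are hypothetical.

```python
import numpy as np

def squared_distances(X, centroids):
    """Squared Euclidean distance from every point to every centroid, shape (n, c)."""
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)

def fcm_memberships(X, centroids, m=2.0):
    """Fuzzy memberships: each row sums to one (degrees of sharing)."""
    d2 = np.maximum(squared_distances(X, centroids), 1e-12)  # avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def pcm_typicalities(X, centroids, eta, m=2.0):
    """Possibilistic typicalities: rows need not sum to one."""
    d2 = squared_distances(X, centroids)
    eta = np.asarray(eta, dtype=float)  # assumed scale per cluster (or one scalar)
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

# Two points near the centroids and one far-away point (illustrative data only).
X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
V = np.array([[0.0, 0.0], [1.0, 0.0]])
U = fcm_memberships(X, V)
T = pcm_typicalities(X, V, eta=np.array([1.0, 1.0]))
print("fuzzy memberships of the far point:", U[2])
print("typicalities of the far point:     ", T[2])
```

For the far point, the fuzzy memberships are roughly 0.48 and 0.52, because they are forced to sum to one, while both typicalities fall below 0.01. This is why typicality is the more useful signal for flagging outliers.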

    Air pollution Analysis with a PFCM Clustering Algorithm Applied in a Real Database of Salamanca (Mexico)

    Over the last ten years, Salamanca has been considered among the most polluted cities in Mexico. Nowadays, there is an Automatic Environmental Monitoring Network (AEMN) which measures air pollutants (Sulphur Dioxide (SO2), Particulate Matter (PM10), Ozone (O3), etc.), as well as environmental variables (wind speed, wind direction, temperature, and relative humidity), sampling each variable every minute. The AEMN is mainly based on three monitoring stations located at Cruz Roja, DIF, and Nativitas. In this work, we use the PFCM (Possibilistic Fuzzy c-Means) clustering algorithm as a means to get a combined measure from the three stations, aiming to provide a tool for better management of contingencies in the city, such that local or general action can be taken according to the pollution level given by each station and the combined measure. We also performed an analysis of correlation between pollution and environmental variables. The results show a significant correlation between pollutant concentrations and some environmental variables. The combined measure and the correlations can therefore be used to establish general contingency thresholds.

    An Efficient Fuzzy Possibilistic C-Means with Penalized and Compensated Constraints

    Improvements in sensing and storage devices and impressive growth in applications such as Internet search, digital imaging, and video surveillance have generated large volumes of high-dimensional data. The rise in both the quantity and the variety of data requires improved techniques to understand, process and summarize the data. Categorizing data into reasonable groupings is one of the most essential techniques for understanding and learning. This is performed with the help of a technique called clustering, which is widely used in fields such as pattern recognition, image processing, and data analysis. The most commonly used clustering technique is K-Means clustering, but it results in misclassification when large data sets are involved. To overcome this disadvantage, the Fuzzy Possibilistic C-Means (FPCM) algorithm can be used for clustering. FPCM combines the advantages of the Possibilistic C-Means (PCM) algorithm and fuzzy logic. To further improve clustering performance, penalized and compensated constraints are used in this paper. Penalized and compensated terms are embedded in the objective function of the modified fuzzy possibilistic clustering method to construct a clustering with enhanced performance. The experimental results illustrate the enhanced performance of the proposed clustering technique when compared to the fuzzy possibilistic c-means clustering algorithm.

    Comparison of the FCM, PCM, FPCM and PFCM algorithms in clustering methods (Kümeleme yöntemlerinde BCO, OCO, BOCO ve OBCO algoritmalarının karşılaştırılması)

    Clustering is the process of dividing objects into subgroups according to their characteristics, so that data within the same cluster are similar while data in different clusters differ. Fuzzy clustering algorithms are based on the C-Means family, and the strongest algorithm is the Fuzzy C-Means (FCM) algorithm. In this study, the FCM, Possibilistic Fuzzy C-Means (PFCM), Fuzzy Possibilistic C-Means (FPCM) and Possibilistic C-Means (PCM) algorithms are used to classify several real data sets (the E. coli, wine and seed data sets) into different clusters using MATLAB. The results of the PFCM, FPCM, PCM and FCM algorithms are compared according to classification accuracy, root mean squared error (RMSE) and mean absolute error (MAE). The results show that the PFCM and FPCM algorithms perform better than FCM and PCM according to these criteria.
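
The three comparison criteria named in this abstract can be sketched as follows. This is an illustrative Python version (the study itself reports using MATLAB). It assumes the cluster labels have already been mapped onto the class labels, a step the abstract does not describe. The function names and the example labels are hypothetical.

```python
import numpy as np

def accuracy(labels_true, labels_pred):
    """Fraction of observations whose predicted label equals the true label."""
    t, p = np.asarray(labels_true), np.asarray(labels_pred)
    return float(np.mean(t == p))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    t, p = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((t - p) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    t, p = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(t - p)))

# Made-up labels, assumed to be aligned with the true classes beforehand.
true_labels = [0, 1, 1, 0]
pred_labels = [0, 1, 0, 0]
print(accuracy(true_labels, pred_labels))
print(rmse(true_labels, pred_labels), mae(true_labels, pred_labels))
```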

    Identification of pore spaces in 3D CT soil images using a PFCM partitional clustering

    Recent advances in non-destructive imaging techniques, such as X-ray computed tomography (CT), make it possible to analyse pore space features through direct visualisation of soil structures. A quantitative characterisation of the three-dimensional solid-pore architecture is important to understand soil mechanics, as they relate to the control of biological, chemical, and physical processes across scales. This analysis technique therefore offers an opportunity to better interpret soil strata, as new and relevant information can be obtained. In this work, we propose an approach to automatically identify the pore structure of a set of 200 2D images that represent slices of an original 3D CT image of a soil sample, which can be accomplished through non-linear enhancement of the pixel grey levels and an image segmentation based on a PFCM (Possibilistic Fuzzy C-Means) algorithm. Once the solids and pore spaces have been identified, the set of 200 2D images is then used to reconstruct an approximation of the soil sample by projecting only the pore spaces. This reconstruction shows the structure of the soil and its pores, which become more bounded, less bounded, or unbounded with changes in depth. If the soil sample image quality is sufficiently favourable in terms of contrast, noise and sharpness, the pore identification is less complicated, and the PFCM clustering algorithm can be used without additional processing; otherwise, images require pre-processing before using this algorithm. Promising results were obtained with four soil samples, the first of which was used to show the validity of the algorithm, while the other three were used to demonstrate the robustness of our proposal. The methodology we present here can better detect the solid soil and pore spaces on CT images, enabling the generation of better 2D-3D representations of pore structures from segmented 2D images.

    Early detection of health changes in the elderly using in-home multi-sensor data streams

    The rapid aging of the population worldwide requires increased attention from health care providers and the entire society. For the elderly to live independently, many health issues related to old age, such as frailty and risk of falling, need increased attention and monitoring. When monitoring daily routines for older adults, it is desirable to detect the early signs of health changes before serious health events, such as hospitalizations, happen, so that timely and adequate preventive care may be provided. By deploying multi-sensor systems in homes of the elderly, we can track trajectories of daily behaviors in a feature space defined using the sensor data. In this work, we investigate a methodology for learning data distribution from streaming data and tracking the evolution of the behavior trajectories over long periods (years) using high dimensional streaming clustering and provide very early indicators of changes in health. If we assume that habitual behaviors correspond to clusters in feature space and diseases produce a change in behavior, albeit not highly specific, tracking trajectory deviations can provide hints of early illness. Retrospectively, we visualize the streaming clustering results and track how the behavior clusters evolve in feature space with the help of two dimension-reduction algorithms, Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE). Moreover, our tracking algorithm in the original high dimensional feature space generates early health warning alerts if a negative trend is detected in the behavior trajectory. We validated our algorithm on synthetic data, real-world data and tested it on a pilot dataset of four TigerPlace residents monitored with a collection of motion, bed, and depth sensors over ten years. We used the TigerPlace electronic health records (EHR) to understand the residents' behavior patterns and to evaluate and explain the health warnings generated by our algorithm. The results obtained on the TigerPlace dataset show that most of the warnings produced by our algorithm can be linked to health events documented in the EHR, providing strong support for a prospective deployment of the approach. Includes bibliographical references.

    Review on “Typicality-Based Collaborative Filtering Recommendation using Sub Clustering for Online Shopping”

    Collaborative filtering is a convenient mechanism used in recommender systems to find similar items in a group. Items that match a user's preferences can be identified by collaborative filtering based on items and on users. However, earlier filtering techniques have drawbacks that lead to lower accuracy, data sparsity and prediction errors. This work takes advantage of the notion of object typicality from cognitive psychology and suggests a typicality-based collaborative filtering recommendation method named Tyco. A distinguishing characteristic of typicality-based collaborative filtering is that it finds the neighbours of users by measuring user similarity on the basis of their typicality degrees in user groups, which sets this approach apart from earlier collaborative filtering methods. It is reported to outperform many CF recommendation methods in recommendation accuracy across different types of datasets. The main approach of the proposed method is to sub-cluster all items into several item groups, for example by applying a nearest-neighbour algorithm. This helps users to search for items more easily and increases the accuracy and quality of the recommendations.

    Possibilistic clustering for shape recognition

    Clustering methods have been used extensively in computer vision and pattern recognition. Fuzzy clustering has been shown to be advantageous over crisp (or traditional) clustering in that total commitment of a vector to a given class is not required at each iteration. Recently fuzzy clustering methods have shown spectacular ability to detect not only hypervolume clusters, but also clusters which are actually 'thin shells', i.e., curves and surfaces. Most analytic fuzzy clustering approaches are derived from Bezdek's Fuzzy C-Means (FCM) algorithm. The FCM uses the probabilistic constraint that the memberships of a data point across classes sum to one. This constraint was used to generate the membership update equations for an iterative algorithm. Unfortunately, the memberships resulting from FCM and its derivatives do not correspond to the intuitive concept of degree of belonging, and moreover, the algorithms have considerable trouble in noisy environments. Recently, we cast the clustering problem into the framework of possibility theory. Our approach was radically different from the existing clustering methods in that the resulting partition of the data can be interpreted as a possibilistic partition, and the membership values may be interpreted as degrees of possibility of the points belonging to the classes. We constructed an appropriate objective function whose minimum will characterize a good possibilistic partition of the data, and we derived the membership and prototype update equations from necessary conditions for minimization of our criterion function. In this paper, we show the ability of this approach to detect linear and quartic curves in the presence of considerable noise
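
The abstract refers to the FCM sum-to-one constraint and to a possibilistic objective function without stating either. As a hedged illustration, the forms commonly given in the possibilistic clustering literature are shown below. The notation is an assumption and is not taken from the abstract: $c$ clusters, $n$ points, fuzzifier $m > 1$, distance $d_{ik}$ from point $k$ to prototype $i$, and a positive scale $\eta_i$ per cluster.

```latex
% FCM probabilistic constraint: memberships of each point sum to one
\sum_{i=1}^{c} u_{ik} = 1 \qquad \text{for every } k = 1, \dots, n

% Possibilistic objective (the constraint above is dropped)
J(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \, d_{ik}^{2}
        + \sum_{i=1}^{c} \eta_i \sum_{k=1}^{n} \left( 1 - u_{ik} \right)^{m}

% Membership update obtained from \partial J / \partial u_{ik} = 0
u_{ik} = \frac{1}{1 + \left( d_{ik}^{2} / \eta_i \right)^{1/(m-1)}}
```

The second term penalises memberships close to zero, which rules out the trivial solution $u_{ik} = 0$. The update depends only on the distance of point $k$ to prototype $i$, so a noise point far from every prototype receives a low membership in all classes.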

    Fuzzy C-ordered medoids clustering of interval-valued data

    Fuzzy clustering for interval-valued data helps us to find natural vague boundaries in such data. The Fuzzy c-Medoids Clustering (FcMdC) method is one of the most popular clustering methods based on a partitioning around medoids approach. However, one of the greatest disadvantages of this method is its sensitivity to the presence of outliers in data. This paper introduces a new robust fuzzy clustering method named Fuzzy c-Ordered-Medoids clustering for interval-valued data (FcOMdC-ID). Huber's M-estimators and Yager's Ordered Weighted Averaging (OWA) operators are used in the proposed method to make it robust to outliers. The described algorithm is compared with the fuzzy c-medoids method in experiments performed on synthetic data with different types of outliers. A real application of the FcOMdC-ID is also provided.
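
For readers unfamiliar with OWA operators, the following minimal sketch shows the operator itself: the values are sorted in descending order and then combined with a fixed weight vector. How FcOMdC-ID embeds the operator in its objective is not described in the abstract, so the function name `owa` and the example weights are illustrative assumptions only.

```python
import numpy as np

def owa(values, weights):
    """Ordered weighted average: weights apply to ranks, not to specific inputs."""
    v = np.sort(np.asarray(values, dtype=float))[::-1]  # largest value first
    w = np.asarray(weights, dtype=float)
    if v.shape != w.shape:
        raise ValueError("values and weights must have the same length")
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to one")
    return float(v @ w)

# Made-up residuals with one gross outlier; zero weight on the largest rank ignores it.
residuals = [0.1, 0.2, 9.0]
print(owa(residuals, [0.0, 0.5, 0.5]))      # outlier-resistant aggregate
print(owa(residuals, [1 / 3, 1 / 3, 1 / 3]))  # plain mean, dominated by the outlier
```

Placing all the weight on the first rank gives the maximum, placing it on the last rank gives the minimum, and equal weights give the arithmetic mean. Down-weighting the top ranks is one simple way such an operator can reduce the influence of outliers.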