
    Robust fuzzy clustering for object recognition and classification of relational data

    Prototype-based fuzzy clustering algorithms have a unique ability to partition data while detecting multiple clusters simultaneously. However, since real data is often contaminated with noise, clustering methods need to be made robust to be useful in practice. This dissertation focuses on robust detection of multiple clusters from noisy range images for object recognition. Dave's noise clustering (NC) method has been shown to make prototype-based fuzzy clustering techniques robust. In this work, NC is generalized and the new NC membership is shown to be a product of the fuzzy c-means (FCM) membership and a robust M-estimator weight (or possibilistic membership). Thus the generalized NC approach is shown to have the partitioning ability of FCM and the robustness of M-estimators. Since the NC (or FCM) algorithms are based on a fixed-point iteration technique, they are sensitive to initialization. To overcome this problem, the sampling-based robust LMS algorithm is considered by extending it to a fuzzy c-LMS algorithm for detecting multiple clusters. The concept of repeated evidence has been incorporated to increase the speed of the new approach. The main problem with the LMS approach is the need to order the distance data. To eliminate this problem, a novel sampling-based robust algorithm following the NC principle, called the NLS method, is proposed; it searches directly for clusters in the maximum-density region of the range data without requiring the number of clusters to be specified. The NC concept is also introduced into several fuzzy methods for robust classification of relational data for pattern recognition. This is also extended to non-Euclidean relational data. The resulting algorithms are used for object recognition from range images as well as for identification of bottleneck parts while creating desegregated cells of machines/components in cellular manufacturing and group technology (GT) applications.
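
The product form mentioned in the abstract can be checked numerically for Dave's original NC formulation, in which a noise class sits at a fixed distance delta from every point. The sketch below is only an illustration of that classic identity, not the dissertation's generalized NC; the function names, the toy distances, and the choice delta = 2 are assumptions made for the example.

```python
def fcm_memberships(sq_dists, m=2.0):
    """FCM memberships from squared distances.

    sq_dists[i][k] is the squared distance from prototype i to point k.
    All entries are assumed strictly positive (no point sits on a prototype).
    """
    power = -1.0 / (m - 1.0)
    inv = [[d ** power for d in row] for row in sq_dists]
    totals = [sum(col) for col in zip(*inv)]
    return [[v / t for v, t in zip(row, totals)] for row in inv]


def nc_memberships(sq_dists, delta, m=2.0):
    """Noise-clustering memberships: the noise class is at distance delta."""
    power = -1.0 / (m - 1.0)
    inv = [[d ** power for d in row] for row in sq_dists]
    noise = (delta ** 2) ** power
    totals = [sum(col) + noise for col in zip(*inv)]
    return [[v / t for v, t in zip(row, totals)] for row in inv]


def robust_weights(sq_dists, delta, m=2.0):
    """Per-point factor w[k] with u_nc[i][k] == u_fcm[i][k] * w[k]."""
    power = -1.0 / (m - 1.0)
    noise = (delta ** 2) ** power
    sums = [sum(d ** power for d in col) for col in zip(*sq_dists)]
    return [s / (s + noise) for s in sums]


# Two prototypes, three points; the third point is far from both (an outlier).
sq_dists = [[0.1, 4.0, 100.0],
            [3.0, 0.2, 120.0]]
u_fcm = fcm_memberships(sq_dists)
u_nc = nc_memberships(sq_dists, delta=2.0)
w = robust_weights(sq_dists, delta=2.0)
```

Here the weight stays close to 1 for points near a prototype and shrinks toward 0 for the outlier, so the outlier contributes little to any prototype update even though its FCM memberships still sum to 1.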

    Relational visual cluster validity

    The assessment of cluster validity plays a very important role in cluster analysis. The most commonly used cluster validity methods are based on statistical hypothesis testing or on finding the best clustering scheme by computing a number of different cluster validity indices. A number of visual methods of cluster validity have been developed to display the validity of clusters directly by mapping data into two- or three-dimensional space. However, these methods may lose too much information to correctly estimate the results of clustering algorithms. Although the visual cluster validity (VCV) method of Hathaway and Bezdek can successfully solve this problem, it can only be applied to object data, i.e. feature measurements. There are very few validity methods that can be used to analyze the validity of data where only a similarity or dissimilarity relation exists, that is, relational data. To tackle this problem, this paper presents a relational visual cluster validity (RVCV) method to assess the validity of clustering relational data. This is done by combining the results of the non-Euclidean relational fuzzy c-means (NERFCM) algorithm with a modification of the VCV method to produce a visual representation of cluster validity. RVCV can cluster complete and incomplete relational data and adds to the visual cluster validity theory. Numeric examples using synthetic and real data are presented.

    Evidential relational clustering using medoids

    In real clustering applications, proximity data, in which only pairwise similarities or dissimilarities are known, is more general than object data, in which each pattern is described explicitly by a list of attributes. Medoid-based clustering algorithms, which assume the prototypes of classes are objects, are of great value for partitioning relational data sets. This paper proposes a new prototype-based clustering method, Evidential C-Medoids (ECMdd), which extends Fuzzy C-Medoids (FCMdd) within the theoretical framework of belief functions. In ECMdd, medoids are used as the prototypes that represent the detected classes, including specific classes and imprecise classes. Specific classes are for data that are distinctly far from the prototypes of other classes, while imprecise classes accept objects that may be close to the prototypes of more than one class. This soft decision mechanism can make the clustering results more cautious and reduce misclassification rates. Experiments on synthetic and real data sets illustrate the performance of ECMdd. The results show that ECMdd captures the uncertainty in the internal data structure well. Moreover, it is more robust to initialization than FCMdd. Comment: in The 18th International Conference on Information Fusion, July 2015, Washington, DC, USA.
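
As background for the abstract above, the following is a minimal sketch of the Fuzzy C-Medoids baseline (FCMdd) that ECMdd extends; it is not ECMdd itself and does not involve belief functions. It works directly on a dissimilarity matrix, which is what makes it usable on relational data. The function names, the toy data, the initial medoids, and the handling of zero dissimilarities are all assumptions made for the example.

```python
def memberships(R, medoids, m=2.0):
    """Fuzzy memberships U[i][j] of object j in the cluster of medoid i."""
    n, c = len(R), len(medoids)
    power = 1.0 / (m - 1.0)
    U = [[0.0] * n for _ in range(c)]
    for j in range(n):
        # An object at dissimilarity 0 from a medoid belongs fully to it.
        zeros = [i for i, v in enumerate(medoids) if R[v][j] == 0.0]
        if zeros:
            for i in zeros:
                U[i][j] = 1.0 / len(zeros)
        else:
            inv = [(1.0 / R[v][j]) ** power for v in medoids]
            total = sum(inv)
            for i in range(c):
                U[i][j] = inv[i] / total
    return U


def fcmdd(R, initial_medoids, m=2.0, max_iter=50):
    """Alternate membership and medoid updates until the medoids stop changing."""
    n = len(R)
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        U = memberships(R, medoids, m)
        new_medoids = []
        for row in U:
            # Pick the object that minimizes the membership-weighted dissimilarity.
            costs = [sum((row[j] ** m) * R[k][j] for j in range(n))
                     for k in range(n)]
            new_medoids.append(min(range(n), key=costs.__getitem__))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, memberships(R, medoids, m)


# Toy relational data: pairwise absolute differences of six points on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
R = [[abs(a - b) for b in points] for a in points]
medoids, U = fcmdd(R, initial_medoids=[0, 3])
```

On this toy input the medoids move from the end points of each group to the central objects (indices 1 and 4). Only entries of `R` are ever read, so no feature vectors are required.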

    Median evidential c-means algorithm and its application to community detection

    Median clustering is of great value for partitioning relational data. This paper proposes a new prototype-based clustering method, Median Evidential C-Means (MECM), which extends median c-means and median fuzzy c-means within the theoretical framework of belief functions. The median variant relaxes the restriction of a metric space embedding for the objects but constrains the prototypes to be in the original data set. Due to these properties, MECM can be applied to graph clustering problems. A community detection scheme for social networks based on MECM is investigated, and the obtained credal partitions of graphs, which are more refined than crisp and fuzzy ones, give a better understanding of the graph structures. An initial prototype-selection scheme based on evidential semi-centrality is presented to avoid premature local convergence, and an evidential modularity function is defined to choose the optimal number of communities. Finally, experiments on synthetic and real data sets illustrate the performance of MECM and show how it differs from other methods.

    Dealing with non-metric dissimilarities in fuzzy central clustering algorithms

    Clustering is the problem of grouping objects on the basis of a similarity measure among them. Relational clustering methods can be employed when a feature-based representation of the objects is not available and their description is given in terms of pairwise (dis)similarities. This paper focuses on the relational duals of fuzzy central clustering algorithms and their application in situations where patterns are represented by means of non-metric pairwise dissimilarities. Symmetrization and shift operations have been proposed to transform the dissimilarities among patterns from non-metric to metric. In this paper, we analyze how four popular fuzzy central clustering algorithms are affected by such transformations. The main findings are that the algorithms are not invariant to shift operations but are invariant to symmetrization. Moreover, we highlight the connections between relational duals of central clustering algorithms and central clustering algorithms in kernel-induced spaces. One of the presented algorithms has never been proposed for non-metric relational clustering, and it turns out to be very robust to shift operations. (C) 2008 Elsevier Inc. All rights reserved.
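
The two transformations named in the abstract can be sketched on a small matrix. The version below assumes one common form of each: symmetrization as the average of the matrix and its transpose, and a shift that adds a constant to every off-diagonal entry. It checks only the triangle inequality, not Euclidean embeddability, and the paper's exact definitions may differ. Function names and the toy matrix are assumptions made for the example.

```python
def symmetrize(D):
    """Average D with its transpose so that D[i][j] == D[j][i]."""
    n = len(D)
    return [[0.5 * (D[i][j] + D[j][i]) for j in range(n)] for i in range(n)]


def shift(D, beta):
    """Add beta to every off-diagonal entry; the diagonal is left unchanged."""
    n = len(D)
    return [[D[i][j] + beta if i != j else D[i][j] for j in range(n)]
            for i in range(n)]


def max_triangle_violation(D):
    """Largest value of D[i][j] - D[i][k] - D[k][j] over distinct i, j, k."""
    n = len(D)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if len({i, j, k}) == 3:
                    worst = max(worst, D[i][j] - D[i][k] - D[k][j])
    return worst


# An asymmetric dissimilarity matrix that also violates the triangle inequality.
D = [[0.0, 1.0, 6.0],
     [1.0, 0.0, 1.0],
     [4.0, 1.0, 0.0]]
D_sym = symmetrize(D)
beta = max_triangle_violation(D_sym)   # smallest shift that repairs the violation
D_shifted = shift(D_sym, beta)
```

Each off-diagonal shift adds beta once to the left-hand side of the triangle inequality and twice to the right-hand side, which is why a shift equal to the largest violation is enough here.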

    Clustering in relational data and ontologies

    Title from PDF of title page (University of Missouri--Columbia, viewed on August 20, 2010). The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file. Dissertation advisor: Dr. James M. Keller. Vita. Ph.D. University of Missouri--Columbia 2010. This dissertation studies the problem of clustering objects represented by relational data. This is a pertinent problem as many real-world data sets can only be represented by relational data, for which object-based clustering algorithms are not designed. Relational data are encountered in many fields including biology, management, industrial engineering, and social sciences. Unlike numerical object data, which are represented by a set of feature values (e.g. height, weight, shoe size) of an object, relational object data are the numerical values of (dis)similarity between objects. For this reason, conventional cluster analysis methods such as k-means and fuzzy c-means cannot be used directly with relational data. I focus on three main problems of cluster analysis of relational data: (i) tendency prior to clustering: how many clusters are there?; (ii) partitioning of objects: which objects belong to which cluster?; and (iii) validity of the resultant clusters: are the partitions "good"? Analyses are included in this dissertation that prove that the Visual Assessment of cluster Tendency (VAT) algorithm has a direct relation to single-linkage hierarchical clustering and Dunn's cluster validity index. These analyses are important to the development of two novel clustering algorithms, CLODD (CLustering in Ordered Dissimilarity Data) and ReSL (Rectangular Single-Linkage clustering). Last, this dissertation addresses clustering in ontologies; examples include the Gene Ontology, the MeSH ontology, patient medical records, and web documents. I apply an extension to the Self-Organizing Map (SOM) to produce a new algorithm, the OSOM (Ontological Self-Organizing Map). OSOM provides visualization and linguistic summarization of ontology-based data. Includes bibliographical references.
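
The VAT reordering referred to in the abstract can be sketched in a few lines. The version below follows the commonly described procedure: start from an object involved in the largest dissimilarity, then repeatedly append the unvisited object closest to any visited one. This nearest-neighbour growth resembles Prim's minimum spanning tree construction, which gives an intuition for the single-linkage connection the dissertation analyzes. Function names, tie-breaking, and the toy data are assumptions made for the example, and CLODD and ReSL are not shown.

```python
def vat_order(D):
    """Return a VAT-style ordering of object indices for dissimilarity matrix D."""
    n = len(D)
    # Start from the first row that contains the largest dissimilarity.
    start = max(range(n), key=lambda i: max(D[i]))
    order = [start]
    remaining = [q for q in range(n) if q != start]
    while remaining:
        # Append the unvisited object closest to any already ordered object.
        nxt = min(remaining, key=lambda q: min(D[p][q] for p in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def reorder(D, order):
    """Permute rows and columns of D according to the given ordering."""
    return [[D[p][q] for q in order] for p in order]


# Toy relational data: two interleaved groups of points on a line.
points = [0.0, 10.0, 1.0, 11.0, 2.0]
D = [[abs(a - b) for b in points] for a in points]
order = vat_order(D)
D_star = reorder(D, order)
```

After reordering, objects from the same group become adjacent, so small dissimilarities collect in blocks along the diagonal of `D_star`. Displayed as an image, those blocks are what is read as evidence of cluster tendency.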

    Survey of data mining approaches to user modeling for adaptive hypermedia

    The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.