    Methods of Hierarchical Clustering

    We survey agglomerative hierarchical clustering algorithms and discuss efficient implementations that are available in R and other software environments. We look at hierarchical self-organizing maps and mixture models. We review grid-based clustering, focusing on hierarchical density-based approaches. Finally, we describe a recently developed, very efficient (linear-time) hierarchical clustering algorithm, which can also be viewed as a hierarchical grid-based algorithm. Comment: 21 pages, 2 figures, 1 table, 69 references
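
For orientation, the agglomerative scheme that this survey covers can be sketched as below. This is a deliberately naive single-linkage version that runs in roughly cubic time. It is not the linear-time algorithm the abstract mentions, nor any of the efficient implementations it surveys. The function name `single_linkage` and the merge-list output format are assumptions made for illustration.

```python
import math

def single_linkage(points):
    """Naive agglomerative clustering with single linkage (illustrative only).

    Returns the merge history as (cluster_id_a, cluster_id_b, distance) tuples.
    Original points get ids 0..n-1; each merge creates the next unused id.
    """
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        # Find the pair of clusters whose closest members are nearest to each other.
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                d = min(math.dist(points[p], points[q])
                        for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Merge the two clusters under a new id and record the step.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges
```

The merge history plays the role of a dendrogram: cutting it at a chosen distance threshold yields a flat partition.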

    Probabilistic Self-Organizing Maps for Text-Independent Speaker Identification

    This paper introduces a novel speaker modeling technique for text-independent speaker identification using probabilistic self-organizing maps (PbSOMs). The motivation is to combine the self-organizing property of self-organizing maps with the generative power of Gaussian mixture models (GMMs). Experimental results show that the PbSOM-based technique significantly outperforms the traditional technique based on classical GMMs trained with the EM algorithm or its deterministic variant. More precisely, it achieves a relative accuracy improvement of roughly 39% and is much less sensitive to the initialization of the model parameters.

    Probabilistic Point Cloud Modeling via Self-Organizing Gaussian Mixture Models

    This letter presents a continuous probabilistic modeling methodology for spatial point cloud data using finite Gaussian Mixture Models (GMMs) in which the number of components is adapted to the scene complexity. Few hierarchical and adaptive methods have been proposed to address the challenge of balancing model fidelity with size. Instead, state-of-the-art mapping approaches require tuning parameters for specific use cases and do not generalize across diverse environments. To address this gap, we use a self-organizing principle from information-theoretic learning to automatically adapt the complexity of the GMM based on the relevant information in the sensor data. The approach is evaluated against existing point cloud modeling techniques on real-world data with varying degrees of scene complexity. Comment: 8 pages, 6 figures, to appear in IEEE Robotics and Automation Letters

    Incremental Multimodal Surface Mapping via Self-Organizing Gaussian Mixture Models

    This letter describes an incremental multimodal surface mapping methodology, which represents the environment as a continuous probabilistic model. This model enables high-resolution reconstruction while simultaneously compressing spatial and intensity point cloud data. The strategy employed in this work uses Gaussian mixture models (GMMs) to represent the environment. While prior GMM-based mapping works have developed methodologies to determine the number of mixture components using information-theoretic techniques, these approaches either operate on individual sensor observations, making them unsuitable for incremental mapping, or are not real-time viable, especially for applications where high-fidelity modeling is required. To bridge this gap, this letter introduces a spatial hash map for rapid GMM submap extraction combined with an approach to determine relevant and redundant data in a point cloud. These contributions increase computational speed by an order of magnitude compared to state-of-the-art incremental GMM-based mapping. In addition, the proposed approach yields a superior tradeoff between map accuracy and size when compared to state-of-the-art mapping methodologies (both GMM-based and non-GMM-based). Evaluations are conducted using both simulated and real-world data. The software is released open-source to benefit the robotics community. Comment: 7 pages, 7 figures, under review at IEEE Robotics and Automation Letters
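
The abstract does not describe how its spatial hash map is built, so the following is only a generic sketch of the idea: items (for example, indices of GMM components) are bucketed by the integer grid cell that contains a representative point, so that a local region can be fetched without scanning the whole map. The class name `SpatialHash`, the `cell_size` parameter, and the `query_box` method are hypothetical and are not taken from the letter or its released software.

```python
import math
from collections import defaultdict

class SpatialHash:
    """Generic 3D spatial hash (illustrative; not the authors' implementation)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # integer cell key -> list of stored items

    def _key(self, point):
        # Map a 3D point to the integer coordinates of its containing cell.
        return tuple(math.floor(c / self.cell_size) for c in point)

    def insert(self, point, item):
        self.cells[self._key(point)].append(item)

    def query_box(self, lower, upper):
        """Return items stored in every cell that overlaps the axis-aligned box.

        Results are at cell granularity, so an item slightly outside the box
        may be returned if its cell overlaps the box.
        """
        lo, hi = self._key(lower), self._key(upper)
        found = []
        for i in range(lo[0], hi[0] + 1):
            for j in range(lo[1], hi[1] + 1):
                for k in range(lo[2], hi[2] + 1):
                    # .get avoids creating empty buckets in the defaultdict.
                    found.extend(self.cells.get((i, j, k), []))
        return found
```

Lookup cost depends on the number of cells in the queried box rather than on the total map size, which is the property that makes this kind of structure attractive for extracting local submaps.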

    Algorithms for Hierarchical Clustering: An Overview, II

    We survey agglomerative hierarchical clustering algorithms and discuss efficient implementations that are available in R and other software environments. We look at hierarchical self-organizing maps and mixture models. We review grid-based clustering, focusing on hierarchical density-based approaches. Finally, we describe a recently developed, very efficient (linear-time) hierarchical clustering algorithm, which can also be viewed as a hierarchical grid-based algorithm. This review adds to the earlier version, Murtagh and Contreras (2012).

    Description of Input Patterns by Linear Mixtures of SOM Models

    This paper introduces a novel way of analyzing input patterns presented to the Self-Organizing Map (SOM). Instead of identifying only the "winner," i.e., the model that best matches the input, we determine the linear mixture of the models (reference vectors) of the SOM that best approximates the input vector. It will be shown that if only nonnegative weights are allowed in this linear mixture, the expansion of the input pattern in terms of the models is very meaningful, contains only a few terms, and provides better insight into the input state than the mere "winner" can give. If the models then fall into classes that are known a priori, the sums of the weights over each class can be interpreted as expressing the affiliation of the input with the respective classes.
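
The nonnegative mixture described here amounts to a nonnegative least-squares fit of the input vector by the SOM reference vectors. The abstract does not say which solver the paper uses, so the sketch below substitutes plain projected gradient descent. The function names `nonnegative_mixture` and `class_affiliation`, the iteration count, and the step-size choice are assumptions for illustration, and the code assumes at least one model vector is nonzero.

```python
def nonnegative_mixture(models, x, iterations=2000):
    """Approximate x by sum_i w_i * models[i] with all w_i >= 0.

    Uses projected gradient descent (a stand-in solver, not necessarily the
    paper's method). Assumes at least one model vector is nonzero.
    """
    # The squared Frobenius norm bounds the largest eigenvalue of M^T M,
    # so dividing the gradient step by it keeps the iteration stable.
    frob_sq = sum(v * v for m in models for v in m)
    weights = [0.0] * len(models)
    for _ in range(iterations):
        # Residual between the input and the current mixture.
        residual = list(x)
        for w, m in zip(weights, models):
            for j, v in enumerate(m):
                residual[j] -= w * v
        # Gradient step, then projection onto the nonnegative orthant.
        weights = [
            max(0.0, w + sum(v * r for v, r in zip(m, residual)) / frob_sq)
            for w, m in zip(weights, models)
        ]
    return weights

def class_affiliation(weights, labels):
    """Sum the mixture weights over each a priori class label."""
    totals = {}
    for w, label in zip(weights, labels):
        totals[label] = totals.get(label, 0.0) + w
    return totals
```

Given the fitted weights and one class label per model, `class_affiliation` returns the per-class weight sums that the abstract interprets as the affiliation of the input with each class.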

    Missing data imputation through generative topographic mapping as a mixture of t-distributions: Theoretical developments

    The Generative Topographic Mapping (GTM) was originally conceived as a probabilistic alternative to the well-known, neural network-inspired, Self-Organizing Map (SOM). The GTM can also be interpreted as a constrained mixture of distributions model. In recent years, much attention has been directed towards Student t-distributions as an alternative to Gaussians in mixture models due to their robustness towards outliers. In this report, the GTM is redefined as a constrained mixture of t-distributions: the t-GTM, and the Expectation-Maximization algorithm that is used to fit the model to the data is modified to provide missing data imputation. Postprint (published version)