9,964 research outputs found

    Exploratory Analysis of Functional Data via Clustering and Optimal Segmentation

    Full text link
    We propose in this paper an exploratory analysis algorithm for functional data. The method partitions a set of functions into KK clusters and represents each cluster by a simple prototype (e.g., piecewise constant). The total number of segments in the prototypes, PP, is chosen by the user and optimally distributed among the clusters via two dynamic programming algorithms. The practical relevance of the method is shown on two real world datasets

    Data-Driven Shape Analysis and Processing

    Full text link
    Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing.Comment: 10 pages, 19 figure

    Nonparametric Hierarchical Clustering of Functional Data

    Full text link
    In this paper, we deal with the problem of curves clustering. We propose a nonparametric method which partitions the curves into clusters and discretizes the dimensions of the curve points into intervals. The cross-product of these partitions forms a data-grid which is obtained using a Bayesian model selection approach while making no assumptions regarding the curves. Finally, a post-processing technique, aiming at reducing the number of clusters in order to improve the interpretability of the clustering, is proposed. It consists in optimally merging the clusters step by step, which corresponds to an agglomerative hierarchical classification whose dissimilarity measure is the variation of the criterion. Interestingly this measure is none other than the sum of the Kullback-Leibler divergences between clusters distributions before and after the merges. The practical interest of the approach for functional data exploratory analysis is presented and compared with an alternative approach on an artificial and a real world data set

    Neural Networks for Complex Data

    Full text link
    Artificial neural networks are simple and efficient machine learning tools. Defined originally in the traditional setting of simple vector data, neural network models have evolved to address more and more difficulties of complex real world problems, ranging from time evolving data to sophisticated data structures such as graphs and functions. This paper summarizes advances on those themes from the last decade, with a focus on results obtained by members of the SAMM team of Universit\'e Paris

    Recent advances in directional statistics

    Get PDF
    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments discussed.Comment: 61 page

    What are the true clusters?

    Get PDF
    Constructivist philosophy and Hasok Chang's active scientific realism are used to argue that the idea of "truth" in cluster analysis depends on the context and the clustering aims. Different characteristics of clusterings are required in different situations. Researchers should be explicit about on what requirements and what idea of "true clusters" their research is based, because clustering becomes scientific not through uniqueness but through transparent and open communication. The idea of "natural kinds" is a human construct, but it highlights the human experience that the reality outside the observer's control seems to make certain distinctions between categories inevitable. Various desirable characteristics of clusterings and various approaches to define a context-dependent truth are listed, and I discuss what impact these ideas can have on the comparison of clustering methods, and the choice of a clustering methods and related decisions in practice
    • 

    corecore