
    Supervised Learning with Indefinite Topological Kernels

    Full text link
    Topological Data Analysis (TDA) is a recent and growing branch of statistics devoted to the study of the shape of data. In this work we investigate the predictive power of TDA in the context of supervised learning. Since topological summaries, most notably the persistence diagram, are typically defined in complex spaces, we adopt a kernel approach to translate them into more familiar vector spaces. We define a topological exponential kernel, characterize it, and show that, despite not being positive semi-definite, it can be successfully used in regression and classification tasks.
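
    The paper's exact kernel is not spelled out in the abstract, but the overall recipe, an exponential kernel built from a distance between persistence diagrams and plugged into a standard classifier even though it may be indefinite, can be sketched as follows. This is a minimal illustration under assumptions: it uses the bottleneck distance from gudhi and scikit-learn's SVC with a precomputed Gram matrix, and the exponential-of-squared-distance form is a placeholder rather than the paper's definition.

```python
# Minimal sketch (not the paper's exact kernel): an exponential similarity
# built from the bottleneck distance between persistence diagrams, used as a
# precomputed, possibly indefinite Gram matrix in an SVM.
import numpy as np
import gudhi
from sklearn.svm import SVC

def exp_topological_gram(diagrams_a, diagrams_b, sigma=1.0):
    """K[i, j] = exp(-d_B(D_i, D_j)^2 / (2 * sigma^2)); not guaranteed PSD."""
    K = np.zeros((len(diagrams_a), len(diagrams_b)))
    for i, da in enumerate(diagrams_a):
        for j, db in enumerate(diagrams_b):
            d = gudhi.bottleneck_distance(da, db)
            K[i, j] = np.exp(-d ** 2 / (2.0 * sigma ** 2))
    return K

# diagrams: lists of (n_points, 2) arrays of (birth, death) pairs; y: labels
# K_train = exp_topological_gram(train_dgms, train_dgms)
# clf = SVC(kernel="precomputed").fit(K_train, y_train)
# K_test = exp_topological_gram(test_dgms, train_dgms)
# y_pred = clf.predict(K_test)
```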

    Geometric Inference on Kernel Density Estimates

    Get PDF
    We show that geometric inference on a point cloud can be performed by examining its kernel density estimate with a Gaussian kernel. This allows one to work with kernel density estimates, which are robust to spatial noise, subsampling, and approximate computation, rather than raw point sets. This is achieved by examining the sublevel sets of the kernel distance, which map isomorphically to superlevel sets of the kernel density estimate. We prove new properties of the kernel distance, demonstrating stability results and allowing it to inherit reconstruction results from recent advances in distance-based topological reconstruction. Moreover, we provide an algorithm to estimate its topology using weighted Vietoris-Rips complexes. Comment: To appear in SoCG 2015. 36 pages, 5 figures.
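
    The density-based viewpoint can be illustrated with a small sketch: evaluate a Gaussian kernel density estimate on a grid and take the persistence of its superlevel sets (equivalently, the sublevel sets of the negated density). This assumes scipy and a recent gudhi whose CubicalComplex accepts a NumPy array of top-dimensional cells; the paper itself works with the kernel distance and weighted Vietoris-Rips complexes rather than this grid-based shortcut.

```python
# Sketch: persistence of the superlevel sets of a Gaussian KDE on a 2D grid.
# Illustrates the density viewpoint only; the paper uses the kernel distance
# and weighted Vietoris-Rips complexes instead of a cubical grid.
import numpy as np
from scipy.stats import gaussian_kde
import gudhi

def kde_superlevel_persistence(points, grid_size=64, bandwidth=0.2):
    kde = gaussian_kde(points.T, bw_method=bandwidth)
    xs = np.linspace(points[:, 0].min() - 1, points[:, 0].max() + 1, grid_size)
    ys = np.linspace(points[:, 1].min() - 1, points[:, 1].max() + 1, grid_size)
    xx, yy = np.meshgrid(xs, ys)
    density = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(grid_size, grid_size)
    # Superlevel sets of the density are sublevel sets of its negation.
    return gudhi.CubicalComplex(top_dimensional_cells=-density).persistence()

# Two noisy circles should produce two prominent 1-dimensional classes.
theta = np.random.uniform(0, 2 * np.pi, 400)
cloud = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * np.random.randn(400, 2)
cloud[200:, 0] += 3.0
print(kde_superlevel_persistence(cloud)[:5])
```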

    The persistence landscape and some of its properties

    Full text link
    Persistence landscapes map persistence diagrams into a function space, which may often be taken to be a Banach space or even a Hilbert space. In the latter case, the map is a feature map and there is an associated kernel. The main advantage of this summary is that it allows one to apply tools from statistics and machine learning. Furthermore, the mapping from persistence diagrams to persistence landscapes is stable and invertible. We introduce a weighted version of the persistence landscape and define a one-parameter family of Poisson-weighted persistence landscape kernels that may be useful for learning. We also demonstrate some additional properties of the persistence landscape. First, the persistence landscape may be viewed as a tropical rational function. Second, in many cases it is possible to exactly reconstruct all of the component persistence diagrams from an average persistence landscape. It follows that the persistence landscape kernel is characteristic for certain generic empirical measures. Finally, the persistence landscape distance may be arbitrarily small compared to the interleaving distance. Comment: 18 pages, to appear in the Proceedings of the 2018 Abel Symposium.
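
    For concreteness, the k-th landscape function is λ_k(t), the k-th largest value of max(min(t − b, d − t), 0) over the points (b, d) of the diagram. Below is a minimal NumPy sketch of this plain (unweighted) definition, sampled on a grid; the paper's weighted and Poisson-weighted variants are not shown.

```python
# Sketch: sample the first k persistence landscape functions on a grid.
# lambda_k(t) is the k-th largest of max(min(t - b, d - t), 0) over the
# diagram's (birth, death) points; weighted variants are not implemented.
import numpy as np

def persistence_landscape(diagram, k=3, resolution=200):
    diagram = np.asarray(diagram, dtype=float)        # (n, 2) birth/death pairs
    grid = np.linspace(diagram[:, 0].min(), diagram[:, 1].max(), resolution)
    tents = np.minimum(grid[None, :] - diagram[:, [0]],
                       diagram[:, [1]] - grid[None, :])
    tents = np.maximum(tents, 0.0)                    # one tent per diagram point
    tents = -np.sort(-tents, axis=0)                  # sort each column, descending
    landscapes = np.zeros((k, resolution))
    landscapes[:min(k, len(diagram))] = tents[:k]
    return grid, landscapes

grid, lams = persistence_landscape([(0.0, 4.0), (1.0, 3.0), (2.0, 6.0)], k=2)
print(lams.max(axis=1))   # peak height of each landscape level
```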

    Interpretable statistics for complex modelling: quantile and topological learning

    Get PDF
    As the complexity of our data has increased exponentially in recent decades, so has our need for interpretable features. This thesis revolves around two paradigms for approaching this quest for insight. In the first part we focus on parametric models, where the problem of interpretability can be seen as one of “parametrization selection”. We introduce a quantile-centric parametrization and show the advantages of our proposal in the context of regression, where it allows us to bridge the gap between classical generalized linear (mixed) models and increasingly popular quantile methods. The second part of the thesis, concerned with topological learning, tackles the problem from a non-parametric perspective. Since topology can be thought of as a way of characterizing data in terms of their connectivity structure, it allows us to represent complex and possibly high-dimensional data through a few features, such as the number of connected components, loops and voids. We illustrate how the emerging branch of statistics devoted to recovering topological structures in the data, Topological Data Analysis, can be exploited for both exploratory and inferential purposes, with a special emphasis on kernels that preserve the topological information in the data. Finally, we show with an application how these two approaches can borrow strength from one another in the identification and description of brain activity through fMRI data from the ABIDE project.

    The Christoffel-Darboux kernel for topological data analysis

    Get PDF
    Persistent homology has been widely used to study the topology of point clouds in $\mathbb{R}^n$. Standard approaches are very sensitive to outliers, and their computational complexity scales poorly with the number of data points. In this paper we introduce a novel persistence module for a point cloud using the theory of Christoffel-Darboux kernels. This module is robust to (statistical) outliers in the data and can be computed in time linear in the number of data points. We illustrate the benefits and limitations of our new module with various numerical examples in $\mathbb{R}^n$, for $n = 1, 2, 3$. Our work expands upon recent applications of Christoffel-Darboux kernels in the context of statistical data analysis and geometric inference (Lasserre, Pauwels and Putinar, 2022). There, these kernels are used to construct a polynomial whose level sets capture the geometry of a point cloud in a precise sense. We show that the persistent homology associated to the sublevel set filtration of this polynomial is stable with respect to the Wasserstein distance. Moreover, we show that the persistent homology of this filtration can be computed in singly exponential time in the ambient dimension $n$, using a recent algorithm of Basu & Karisani (2022). Comment: 22 pages, 11 figures, 1 table.
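
    The construction behind this can be sketched concretely: fix a degree d, build the empirical moment matrix M_d of the monomial basis on the point cloud, and filter by the polynomial q(x) = v_d(x)^T M_d^{-1} v_d(x) (the reciprocal of the Christoffel function), whose sublevel sets hug the support of the data. Below is a minimal 2D NumPy sketch under these assumptions; the degree, scaling and regularization choices are illustrative, and the persistence computation on the sublevel sets is omitted.

```python
# Sketch: q(x) = v_d(x)^T M_d^{-1} v_d(x) for a 2D point cloud, the polynomial
# whose sublevel-set filtration is made persistent in the paper. Degree and
# regularization are illustrative choices, not the paper's.
import numpy as np

def monomial_basis(X, degree):
    # all monomials x^a * y^b with a + b <= degree, evaluated at the rows of X
    cols = [X[:, 0] ** a * X[:, 1] ** b
            for a in range(degree + 1) for b in range(degree + 1 - a)]
    return np.stack(cols, axis=1)                 # (n_points, n_monomials)

def inverse_christoffel(points, degree=4, reg=1e-8):
    V = monomial_basis(points, degree)
    M = V.T @ V / len(points)                     # empirical moment matrix M_d
    M_inv = np.linalg.inv(M + reg * np.eye(M.shape[0]))
    def q(query):                                 # query: (m, 2) array
        Vq = monomial_basis(np.atleast_2d(query), degree)
        return np.einsum("ij,jk,ik->i", Vq, M_inv, Vq)
    return q

# Points near the unit circle: q is small on the circle, large away from it.
theta = np.random.uniform(0, 2 * np.pi, 500)
cloud = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * np.random.randn(500, 2)
q = inverse_christoffel(cloud)
print(q(np.array([[1.0, 0.0], [0.0, 0.0], [3.0, 3.0]])))
```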

    Kernel-based methods for persistent homology and their applications to Alzheimer's Disease

    Get PDF
    Kernel-based methods are powerful tools that are widely applied across many fields of research. In recent years, methods from computational topology have emerged for characterizing the intrinsic geometry of data. Persistent homology is a central tool in topological data analysis, which allows one to capture the evolution of topological features of the data. Persistence diagrams are a natural way to summarize these features, but they cannot be used directly in machine learning algorithms. To deal with them, we first analyse various recently developed kernel-based methods, then we propose and apply Variable Scaled Kernels (VSKs) in the persistence diagram framework. We then discuss the application of these kernels to medical imaging in the context of Alzheimer’s Disease classification. Taking into account cortical thickness measures on the cortical surface, we build persistence diagrams for different MRI subjects and perform classification tests using the support vector machine classifier.
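
    The abstract does not spell out the construction, but a variably scaled kernel augments each input x with a scale function ψ(x) and evaluates a base kernel on the augmented pairs, K_ψ(x, y) = K((x, ψ(x)), (y, ψ(y))). Below is a minimal sketch on vectorized persistence-diagram features with a hypothetical ψ given by total persistence; the thesis's actual choice of ψ, features, and base kernel may well differ.

```python
# Sketch: a variably scaled Gaussian kernel on vectorized persistence features.
# psi (total persistence) is a hypothetical scale function; the thesis's actual
# VSK construction, features and base kernel may differ.
import numpy as np
from sklearn.svm import SVC

def psi(diagram):
    diagram = np.asarray(diagram, dtype=float)
    return np.array([np.sum(diagram[:, 1] - diagram[:, 0])])   # total persistence

def vsk_gram(feats_a, dgms_a, feats_b, dgms_b, gamma=1.0):
    # Augment each feature vector with psi(diagram), then apply a Gaussian kernel.
    A = np.hstack([feats_a, np.array([psi(d) for d in dgms_a])])
    B = np.hstack([feats_b, np.array([psi(d) for d in dgms_b])])
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

# feats_*: (n_subjects, n_features) vectorizations of the diagrams; y: labels
# K_train = vsk_gram(feats_train, dgms_train, feats_train, dgms_train)
# clf = SVC(kernel="precomputed").fit(K_train, y_train)
# K_test = vsk_gram(feats_test, dgms_test, feats_train, dgms_train)
# y_pred = clf.predict(K_test)
```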