    Sliced Wasserstein Kernel for Persistence Diagrams

    Persistence diagrams (PDs) play a key role in topological data analysis (TDA), in which they are routinely used to describe topological properties of complicated shapes. PDs enjoy strong stability properties and have proven their utility in various learning contexts. They do not, however, live in a space naturally endowed with a Hilbert structure and are usually compared with specific distances, such as the bottleneck distance. To incorporate PDs in a learning pipeline, several kernels have been proposed for PDs with a strong emphasis on the stability of the RKHS distance w.r.t. perturbations of the PDs. In this article, we use the Sliced Wasserstein approximation SW of the Wasserstein distance to define a new kernel for PDs, which is not only provably stable but also provably discriminative (depending on the number of points in the PDs) w.r.t. the Wasserstein distance d_1 between PDs. We also demonstrate its practicality, by developing an approximation technique to reduce kernel computation time, and show that our proposal compares favorably to existing kernels for PDs on several benchmarks.
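
    As a rough illustration of the idea in this abstract, the sketch below approximates a sliced Wasserstein distance between two persistence diagrams by projecting their points onto a finite set of directions, and then turns that distance into a Gaussian-style kernel. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `sliced_wasserstein` and `sw_kernel`, the number of directions, the normalization, and the bandwidth parameter `sigma` are all choices made here for illustration.

```python
import numpy as np


def sliced_wasserstein(d1, d2, n_directions=50):
    """Approximate sliced Wasserstein distance between two diagrams.

    d1, d2: array-likes of (birth, death) pairs. Assumed convention:
    each diagram is padded with the diagonal projections of the other
    diagram's points so both point sets have the same size.
    """
    d1 = np.asarray(d1, dtype=float).reshape(-1, 2)
    d2 = np.asarray(d2, dtype=float).reshape(-1, 2)

    def diagonal_projection(d):
        # Orthogonal projection of (b, d) onto the diagonal: ((b+d)/2, (b+d)/2).
        mid = d.mean(axis=1, keepdims=True)
        return np.hstack([mid, mid])

    a = np.vstack([d1, diagonal_projection(d2)])
    b = np.vstack([d2, diagonal_projection(d1)])

    # Evenly spaced directions on the half circle [-pi/2, pi/2).
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_directions, endpoint=False)
    directions = np.stack([np.cos(thetas), np.sin(thetas)], axis=0)  # shape (2, M)

    # Project onto every direction and sort along the point axis; the 1-D
    # Wasserstein-1 cost is the L1 distance between sorted projections.
    pa = np.sort(a @ directions, axis=0)
    pb = np.sort(b @ directions, axis=0)
    return float(np.abs(pa - pb).sum(axis=0).mean())


def sw_kernel(d1, d2, sigma=1.0, n_directions=50):
    """Gaussian-style kernel built on the approximate distance (assumed form)."""
    return float(np.exp(-sliced_wasserstein(d1, d2, n_directions) / (2.0 * sigma**2)))
```

    For example, `sw_kernel([(0.0, 2.0), (1.0, 3.0)], [(0.0, 2.5)])` returns a similarity in (0, 1], and a diagram compared with itself gives exactly 1. Increasing `n_directions` trades computation time for a finer approximation of the integral over directions.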

    Statistical Computational Topology and Geometry for Understanding Data

    Here we describe three projects involving data analysis which focus on engaging statistics with the geometry and/or topology of the data. The first project involves the development and implementation of kernel density estimation for persistence diagrams. These kernel densities consider neighborhoods for every feature in the center diagram and give each feature an independent, orthogonal direction. The creation of kernel densities in this realm yields a (previously unavailable) full characterization of the (random) geometry of a dataspace or data distribution. In the second project, cohomology is used to guide a search for kidney exchange cycles within a kidney paired donation pool. The same technique also produces a score function that helps to predict a patient-donor pair's a priori advantage within a donation pool. The resulting allocation of cycles is determined to be equitable according to a strict analysis of the allocation distribution. In the last project, a previously formulated metric between surfaces called continuous Procrustes distance (CPD) is applied to species discrimination in fossils. This project involves both the application and a rigorous comparison of the metric with its primary competitor, discrete Procrustes distance. Besides comparing the separation power of discrete and continuous Procrustes distances, the effect of surface resolution on CPD is investigated in this study.