Sliced Wasserstein Kernel for Persistence Diagrams
Persistence diagrams (PDs) play a key role in topological data analysis
(TDA), in which they are routinely used to describe topological properties of
complicated shapes. PDs enjoy strong stability properties and have proven their
utility in various learning contexts. They do not, however, live in a space
naturally endowed with a Hilbert structure and are usually compared with
specific distances, such as the bottleneck distance. To incorporate PDs in a
learning pipeline, several kernels have been proposed for PDs with a strong
emphasis on the stability of the RKHS distance w.r.t. perturbations of the PDs.
In this article, we use the Sliced Wasserstein approximation SW of the
Wasserstein distance to define a new kernel for PDs, which is not only provably
stable but also provably discriminative (depending on the number of points in
the PDs) w.r.t. the Wasserstein distance between PDs. We also demonstrate
its practicality, by developing an approximation technique to reduce kernel
computation time, and show that our proposal compares favorably to existing
kernels for PDs on several benchmarks.
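
As a rough illustration of the construction described in this abstract, the sketch below approximates a sliced Wasserstein distance between two persistence diagrams and plugs it into a Gaussian-type kernel. The abstract does not spell out the formulas, so the details here are assumptions: each diagram is augmented with the diagonal projections of the other diagram's points, both are projected onto a fixed number of directions, and the sorted projections are compared in the L1 sense. The function names `sliced_wasserstein` and `sw_kernel`, and the parameters `n_directions` and `sigma`, are hypothetical and chosen for this example only.

```python
import numpy as np

def sliced_wasserstein(d1, d2, n_directions=50):
    """Approximate a sliced Wasserstein distance between two diagrams.

    d1, d2: arrays of shape (n, 2) holding (birth, death) pairs.
    This is a sketch under the assumptions stated in the text above.
    """
    d1 = np.asarray(d1, dtype=float).reshape(-1, 2)
    d2 = np.asarray(d2, dtype=float).reshape(-1, 2)

    def diagonal_projection(d):
        # Orthogonal projection of (b, d) onto the diagonal: ((b+d)/2, (b+d)/2).
        mid = d.sum(axis=1, keepdims=True) / 2.0
        return np.hstack([mid, mid])

    # Augment each diagram so both point sets have the same cardinality.
    a = np.vstack([d1, diagonal_projection(d2)])
    b = np.vstack([d2, diagonal_projection(d1)])

    # Evenly spaced directions on the half circle [-pi/2, pi/2).
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_directions, endpoint=False)
    directions = np.stack([np.cos(thetas), np.sin(thetas)], axis=0)  # (2, M)

    # Project, sort along each direction, and compare in the L1 sense.
    proj_a = np.sort(a @ directions, axis=0)
    proj_b = np.sort(b @ directions, axis=0)
    return np.abs(proj_a - proj_b).sum(axis=0).mean()

def sw_kernel(d1, d2, sigma=1.0, n_directions=50):
    # Assumed kernel form: exp(-SW / (2 * sigma^2)).
    sw = sliced_wasserstein(d1, d2, n_directions)
    return np.exp(-sw / (2.0 * sigma ** 2))

D1 = np.array([[0.0, 1.0], [0.5, 2.0]])
D2 = np.array([[0.1, 1.2]])
print(sliced_wasserstein(D1, D2), sw_kernel(D1, D2))
```

Using a fixed number of directions is what makes the computation cheap: each direction costs one sort, so increasing `n_directions` trades computation time for a finer approximation.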
Statistical Computational Topology and Geometry for Understanding Data
Here we describe three projects involving data analysis which focus on engaging statistics with the geometry and/or topology of the data.
The first project involves the development and implementation of kernel density estimation for persistence diagrams. These kernel densities consider neighborhoods for every feature in the center diagram and give each feature an independent, orthogonal direction. The creation of kernel densities in this realm yields a (previously unavailable) full characterization of the (random) geometry of a dataspace or data distribution.
In the second project, cohomology is used to guide a search for kidney exchange cycles within a kidney paired donation pool. The same technique also produces a score function that helps to predict a patient-donor pair's a priori advantage within a donation pool. The resulting allocation of cycles is determined to be equitable according to a strict analysis of the allocation distribution.
In the last project, a previously formulated metric between surfaces called continuous Procrustes distance (CPD) is applied to species discrimination in fossils. This project involves both the application of the metric and a rigorous comparison with its primary competitor, discrete Procrustes distance. Besides comparing the separation power of discrete and continuous Procrustes distances, this study investigates the effect of surface resolution on CPD.