
1,020 research outputs found

On the Bootstrap for Persistence Diagrams and Landscapes

Author: Chazal Frédéric
Fasy Brittany Terese
Lecci Fabrizio
Rinaldo Alessandro
Singh Aarti
Wasserman Larry
Publication date: 02/11/2013

Persistent homology probes topological properties from point clouds and functions. By looking at multiple scales simultaneously, one can record the births and deaths of topological features as the scale varies. In this paper we use a statistical technique, the empirical bootstrap, to separate topological signal from topological noise. In particular, we derive confidence sets for persistence diagrams and confidence bands for persistence landscapes.
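
As a rough illustration of the empirical bootstrap mentioned in this abstract, the Python sketch below builds a uniform confidence band around the mean of several curves sampled on a common grid (for example, landscapes evaluated at fixed points). The function name bootstrap_band, its parameters, and the sup-norm construction are assumptions made for illustration; this is a generic bootstrap band, not necessarily the authors' exact procedure.

```python
import numpy as np

def bootstrap_band(curves, alpha=0.05, n_boot=1000, seed=0):
    """Uniform (1 - alpha) band for the mean curve via the empirical bootstrap.

    curves: array of shape (n, m), one sampled curve per row (hypothetical input).
    Returns (lower, mean, upper), each of shape (m,).
    """
    curves = np.asarray(curves, dtype=float)
    n = curves.shape[0]
    rng = np.random.default_rng(seed)
    mean = curves.mean(axis=0)

    sup_dev = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the n curves with replacement.
        idx = rng.integers(0, n, size=n)
        boot_mean = curves[idx].mean(axis=0)
        # Scaled sup-norm distance between bootstrap mean and sample mean.
        sup_dev[b] = np.sqrt(n) * np.max(np.abs(boot_mean - mean))

    # The (1 - alpha) quantile of the deviations sets the band half-width.
    half_width = np.quantile(sup_dev, 1.0 - alpha) / np.sqrt(n)
    return mean - half_width, mean, mean + half_width
```

The band has constant width over the grid because the quantile is taken on the supremum of the deviation, which is what makes it a simultaneous band rather than a pointwise one.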

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Stochastic Convergence of Persistence Landscapes and Silhouettes

Author: Chazal Frédéric
Fasy Brittany Terese
Lecci Fabrizio
Rinaldo Alessandro
Wasserman Larry
Publication date: 01/12/2013

Persistent homology is a widely used tool in Topological Data Analysis that encodes multiscale topological information as a multi-set of points in the plane called a persistence diagram. It is difficult to apply statistical theory directly to a random sample of diagrams. Instead, we can summarize the persistent homology with the persistence landscape, introduced by Bubenik, which converts a diagram into a well-behaved real-valued function. We investigate the statistical properties of landscapes, such as weak convergence of the average landscapes and convergence of the bootstrap. In addition, we introduce an alternate functional summary of persistent homology, which we call the silhouette, and derive an analogous statistical theory.
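
To make the conversion from a diagram to a function concrete, the Python sketch below evaluates a persistence landscape on a grid. Each diagram point (b, d) contributes a tent function max(0, min(t - b, d - t)), and the k-th landscape function at t is the k-th largest tent value there. The function name persistence_landscape and its arguments are hypothetical, chosen for illustration only.

```python
import numpy as np

def persistence_landscape(diagram, grid, k_max):
    """Evaluate the first k_max landscape functions on a grid.

    diagram: iterable of (birth, death) pairs with finite deaths.
    grid:    1-D array of evaluation points t.
    Returns an array of shape (k_max, len(grid)).
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    births = diagram[:, 0][:, None]
    deaths = diagram[:, 1][:, None]

    # Tent function of every diagram point at every grid value.
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births,
                                       deaths - grid[None, :]))
    # Sort tent values in descending order at each grid point.
    tents = -np.sort(-tents, axis=0)

    # Landscape functions beyond the number of points are identically zero.
    out = np.zeros((k_max, grid.size))
    n = min(k_max, tents.shape[0])
    out[:n] = tents[:n]
    return out
```

Because the output is an ordinary array, several landscapes evaluated on the same grid can be averaged row by row, which is the kind of operation that is awkward to define on diagrams directly.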

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Directory of Open Access Journals

Journal of Computational Geometry (JoCG - Carleton University, Computational Geometry Lab)

Introduction to the R package TDA

Author: Fasy Brittany Terese
Kim Jisu
Lecci Fabrizio
Maria Clément
Publication date: 29/01/2015

We present a short tutorial and introduction to using the R package TDA, which provides some tools for Topological Data Analysis. In particular, it includes implementations of functions that, given some data, provide topological information about the underlying space, such as the distance function, the distance to a measure, the kNN density estimator, the kernel density estimator, and the kernel distance. The salient topological features of the sublevel sets (or superlevel sets) of these functions can be quantified with persistent homology. We provide an R interface for the efficient algorithms of the C++ libraries GUDHI, Dionysus and PHAT, including a function for the persistent homology of the Rips filtration, and one for the persistent homology of sublevel sets (or superlevel sets) of arbitrary functions evaluated over a grid of points. The significance of the features in the resulting persistence diagrams can be analyzed with functions that implement recently developed statistical methods. The R package TDA also includes the implementation of an algorithm for density clustering, which allows us to identify the spatial organization of the probability mass associated to a density function and visualize it by means of a dendrogram, the cluster tree.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

The persistence landscape and some of its properties

Author: A Adcock
A Collins
A Robinson
A Zomorodian
AJ Blumberg
B Bollobás
B Di Fabio
D Cohen-Steiner
E Munch
F Chazal
H Adams
H Edelsbrunner
I Donato
K Turner
M Carrière
M Gameiro
M Gidea
M Lesnick
P Bendich
P Bubenik
P Dłotko
RJ Adler
S Kališnik
V Kovacev-Nikolic
V Patrangenaru
V Robins
Y Lee
Y Wang
Publication venue: Springer Science and Business Media LLC
Publication date: 26/01/2019

Persistence landscapes map persistence diagrams into a function space, which may often be taken to be a Banach space or even a Hilbert space. In the latter case, it is a feature map and there is an associated kernel. The main advantage of this summary is that it allows one to apply tools from statistics and machine learning. Furthermore, the mapping from persistence diagrams to persistence landscapes is stable and invertible. We introduce a weighted version of the persistence landscape and define a one-parameter family of Poisson-weighted persistence landscape kernels that may be useful for learning. We also demonstrate some additional properties of the persistence landscape. First, the persistence landscape may be viewed as a tropical rational function. Second, in many cases it is possible to exactly reconstruct all of the component persistence diagrams from an average persistence landscape. It follows that the persistence landscape kernel is characteristic for certain generic empirical measures. Finally, the persistence landscape distance may be arbitrarily small compared to the interleaving distance. Comment: 18 pages, to appear in the Proceedings of the 2018 Abel Symposium

arXiv.org e-Print Archive

Crossref

Multiple testing with persistent homology

Author: Mukherjee Sayan
Vejdemo-Johansson Mikael
Publication date: 06/11/2019

Multiple hypothesis testing requires a control procedure. Simply increasing simulations or permutations to meet a Bonferroni-style threshold is prohibitively expensive. In this paper we propose a null model based approach to testing for acyclicity, coupled with a Family-Wise Error Rate (FWER) control method that does not suffer from these computational costs. We adapt a False Discovery Rate (FDR) control approach to the topological setting, and show it to be compatible both with our null model approach and with previous approaches to hypothesis testing in persistent homology. By extending a limit theorem for persistent homology on samples from point processes, we provide theoretical validation for our FWER and FDR control methods.
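
For readers unfamiliar with the two kinds of error control named in this abstract, the Python sketch below contrasts a Bonferroni correction (FWER control) with the Benjamini-Hochberg step-up procedure (a standard FDR control method). These are the textbook procedures applied to a plain vector of p-values; the abstract does not say which FDR procedure the paper adapts, so treating Benjamini-Hochberg as the example is an assumption, and the function names are hypothetical.

```python
import numpy as np

def bonferroni(p_values, alpha=0.05):
    """Reject H_i when p_i <= alpha / m (controls the FWER)."""
    p = np.asarray(p_values, dtype=float)
    return p <= alpha / p.size

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the FDR)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the i-th smallest p-value with alpha * i / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds

    reject = np.zeros(m, dtype=bool)
    if passed.any():
        # Reject every hypothesis up to the largest index that passes.
        k = np.nonzero(passed)[0].max()
        reject[order[:k + 1]] = True
    return reject
```

On the same p-values, the Benjamini-Hochberg rule never rejects fewer hypotheses than Bonferroni, since its smallest threshold equals the Bonferroni threshold and the others are larger.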

arXiv.org e-Print Archive

Statistical topological data analysis using persistence landscapes

Author: Bubenik Peter
Publication date: 23/01/2015

We define a new topological summary for data that we call the persistence landscape. Since this summary lies in a vector space, it is easy to combine with tools from statistics and machine learning, in contrast to the standard topological summaries. Viewed as a random variable with values in a Banach space, this summary obeys a strong law of large numbers and a central limit theorem. We show how a number of standard statistical tests can be used for statistical inference using this summary. We also prove that this summary is stable and that it can be used to provide lower bounds for the bottleneck and Wasserstein distances. Comment: 26 pages, final version, to appear in Journal of Machine Learning Research, includes two additional examples not in the journal version: random geometric complexes and Erdős-Rényi random clique complexes

arXiv.org e-Print Archive

CiteSeerX