Statistical topological data analysis using persistence landscapes
We define a new topological summary for data that we call the persistence
landscape. Since this summary lies in a vector space, it is easy to combine
with tools from statistics and machine learning, in contrast to the standard
topological summaries. Viewed as a random variable with values in a Banach
space, this summary obeys a strong law of large numbers and a central limit
theorem. We show how a number of standard statistical tests can be used for
statistical inference using this summary. We also prove that this summary is
stable and that it can be used to provide lower bounds for the bottleneck and
Wasserstein distances.
Comment: 26 pages, final version, to appear in Journal of Machine Learning
Research, includes two additional examples not in the journal version: random
geometric complexes and Erdos-Renyi random clique complexes
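As an illustration of why this summary lives in a vector space, here is a minimal Python sketch. It assumes the usual definition of the landscape, which the abstract itself does not spell out: for a diagram of (birth, death) pairs, the k-th landscape function at t is the k-th largest value of max(0, min(t - birth, death - t)). The function names `landscape` and `mean_landscape` are hypothetical and chosen only for this example.

```python
def landscape(diagram, k, t):
    """Value at t of the k-th landscape function (k starts at 1).

    Assumed definition: the k-th largest "tent" value
    max(0, min(t - birth, death - t)) over all points in the diagram.
    """
    tents = sorted(
        (max(0.0, min(t - birth, death - t)) for birth, death in diagram),
        reverse=True,
    )
    return tents[k - 1] if k <= len(tents) else 0.0


def mean_landscape(diagrams, k, t):
    """Pointwise average of the k-th landscape over several diagrams.

    Averaging is well defined because landscapes are ordinary
    real-valued functions, unlike the diagrams themselves.
    """
    return sum(landscape(d, k, t) for d in diagrams) / len(diagrams)


if __name__ == "__main__":
    # Two toy diagrams, invented for illustration only.
    d1 = [(0.0, 4.0), (1.0, 3.0)]
    d2 = [(0.0, 2.0)]
    print(landscape(d1, 1, 2.0))
    print(mean_landscape([d1, d2], 1, 1.0))
```

The pointwise mean is the kind of operation that statements such as a law of large numbers or a central limit theorem refer to; no comparable average is directly available for diagrams or barcodes.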
Persistent Homology and String Vacua
We use methods from topological data analysis to study the topological
features of certain distributions of string vacua. Topological data analysis is
a multi-scale approach used to analyze the topological features of a dataset by
identifying which homological characteristics persist over a long range of
scales. We apply these techniques in several contexts. We analyze N=2 vacua by
focusing on certain distributions of Calabi-Yau varieties and Landau-Ginzburg
models. We then turn to flux compactifications and discuss how we can use
topological data analysis to extract physical information. Finally, we apply
these techniques to certain phenomenologically realistic heterotic models. We
discuss the possibility of characterizing string vacua using the topological
properties of their distributions.
Comment: 32 pages, 12 pdf figures
Persistence Bag-of-Words for Topological Data Analysis
Persistent homology (PH) is a rigorous mathematical theory that provides a
robust descriptor of data in the form of persistence diagrams (PDs). PDs
exhibit, however, complex structure and are difficult to integrate in today's
machine learning workflows. This paper introduces persistence bag-of-words: a
novel and stable vectorized representation of PDs that enables the seamless
integration with machine learning. Comprehensive experiments show that the new
representation achieves state-of-the-art performance and beyond in much less
time than alternative approaches.
Comment: Accepted for the Twenty-Eighth International Joint Conference on
Artificial Intelligence (IJCAI-19). arXiv admin note: substantial text
overlap with arXiv:1802.0485
Separating Topological Noise from Features Using Persistent Entropy
Topology is the branch of mathematics that studies shapes
and maps among them. From the algebraic definition of topology, a new
set of algorithms has been derived. These algorithms are identified
with “computational topology”, often referred to as Topological Data
Analysis (TDA), and are used for investigating high-dimensional data in a
quantitative manner. Persistent homology is a fundamental tool
in Topological Data Analysis. It studies the evolution of k-dimensional
holes along a sequence of simplicial complexes (i.e., a filtration). The set
of intervals representing birth and death times of k-dimensional holes
along such a sequence is called the persistence barcode. k-dimensional
holes with short lifetimes are informally considered to be topological
noise, and those with long lifetimes are considered to be topological
features associated with the given data (i.e., the filtration). In this paper, we
derive a simple method for separating topological noise from topological
features using a novel measure for comparing persistence barcodes called
persistent entropy.
Ministerio de Economía y Competitividad MTM2015-67072-
- …
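To make the notion of persistent entropy concrete, here is a minimal Python sketch. It assumes the commonly used definition, which the abstract does not state: the Shannon entropy of the bar lengths after normalizing them to sum to one. The function name `persistent_entropy` and the toy barcodes are hypothetical, and the sketch does not reproduce the paper's noise-separation procedure.

```python
import math


def persistent_entropy(barcode):
    """Shannon entropy of the normalized bar lengths of a barcode.

    barcode: iterable of (birth, death) pairs with finite death times.
    Assumed definition: E = -sum(p_i * log(p_i)), where
    p_i = (death_i - birth_i) / (sum of all bar lengths).
    """
    lengths = [death - birth for birth, death in barcode if death > birth]
    total = sum(lengths)
    if total == 0:
        return 0.0  # empty barcode or only zero-length bars
    return abs(-sum((l / total) * math.log(l / total) for l in lengths))


if __name__ == "__main__":
    # Toy barcodes, invented for illustration only.
    uniform = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
    one_dominant = [(0.0, 10.0), (0.0, 0.1), (0.0, 0.1), (0.0, 0.1)]
    print(persistent_entropy(uniform))
    print(persistent_entropy(one_dominant))
```

Under this definition, a barcode whose bars all have the same length attains the maximum entropy, log(n) for n bars, while a barcode dominated by one long bar has lower entropy.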
