
    Statistical topological data analysis using persistence landscapes

    We define a new topological summary for data that we call the persistence landscape. Since this summary lies in a vector space, it is easy to combine with tools from statistics and machine learning, in contrast to the standard topological summaries. Viewed as a random variable with values in a Banach space, this summary obeys a strong law of large numbers and a central limit theorem. We show how a number of standard statistical tests can be used for statistical inference using this summary. We also prove that this summary is stable and that it can be used to provide lower bounds for the bottleneck and Wasserstein distances.
    Comment: 26 pages, final version, to appear in Journal of Machine Learning Research, includes two additional examples not in the journal version: random geometric complexes and Erdos-Renyi random clique complexes
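As a rough illustration of why this summary "lies in a vector space", the sketch below evaluates a persistence landscape pointwise from a list of (birth, death) pairs. It assumes the usual construction, in which each pair contributes a tent function and the k-th landscape function takes the k-th largest tent value at each point; the function names `tent` and `landscape` are hypothetical and not taken from the paper.

```python
def tent(birth, death, t):
    """Tent function of one (birth, death) pair, evaluated at t."""
    return max(0.0, min(t - birth, death - t))


def landscape(diagram, k, t):
    """k-th landscape function at t: the k-th largest tent value (k >= 1).

    `diagram` is a list of (birth, death) pairs; returns 0.0 when fewer
    than k pairs exist.
    """
    values = sorted((tent(b, d, t) for b, d in diagram), reverse=True)
    return values[k - 1] if k <= len(values) else 0.0


def mean_landscape(diagrams, k, t):
    """Pointwise average of the k-th landscape over several diagrams."""
    return sum(landscape(dgm, k, t) for dgm in diagrams) / len(diagrams)


diagram = [(0.0, 4.0), (1.0, 3.0)]
first_layer = landscape(diagram, 1, 2.0)   # tallest tent at t = 2
second_layer = landscape(diagram, 2, 2.0)  # second tallest tent at t = 2
```

Because landscapes are ordinary real-valued functions, averaging them pointwise (as `mean_landscape` does) is well defined, which is what makes laws of large numbers and standard tests applicable.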

    Persistent Homology and String Vacua

    We use methods from topological data analysis to study the topological features of certain distributions of string vacua. Topological data analysis is a multi-scale approach used to analyze the topological features of a dataset by identifying which homological characteristics persist over a long range of scales. We apply these techniques in several contexts. We analyze N=2 vacua by focusing on certain distributions of Calabi-Yau varieties and Landau-Ginzburg models. We then turn to flux compactifications and discuss how we can use topological data analysis to extract physical information. Finally, we apply these techniques to certain phenomenologically realistic heterotic models. We discuss the possibility of characterizing string vacua using the topological properties of their distributions.
    Comment: 32 pages, 12 pdf figures

    Persistence Bag-of-Words for Topological Data Analysis

    Persistent homology (PH) is a rigorous mathematical theory that provides a robust descriptor of data in the form of persistence diagrams (PDs). PDs, however, exhibit complex structure and are difficult to integrate into today's machine learning workflows. This paper introduces persistence bag-of-words: a novel and stable vectorized representation of PDs that enables seamless integration with machine learning. Comprehensive experiments show that the new representation achieves state-of-the-art performance and beyond in much less time than alternative approaches.
    Comment: Accepted for the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). arXiv admin note: substantial text overlap with arXiv:1802.0485

    Separating Topological Noise from Features Using Persistent Entropy

    Topology is the branch of mathematics that studies shapes and maps among them. From the algebraic definition of topology a new set of algorithms has been derived. These algorithms are identified with "computational topology", often referred to as Topological Data Analysis (TDA), and are used for investigating high-dimensional data in a quantitative manner. Persistent homology is a fundamental tool in Topological Data Analysis. It studies the evolution of k-dimensional holes along a sequence of simplicial complexes (i.e. a filtration). The set of intervals representing birth and death times of k-dimensional holes along such a sequence is called the persistence barcode. k-dimensional holes with short lifetimes are informally considered to be topological noise, and those with long lifetimes are considered to be topological features associated with the given data (i.e. the filtration). In this paper, we derive a simple method for separating topological noise from topological features using a novel measure for comparing persistence barcodes called persistent entropy.
    Ministerio de Economía y Competitividad MTM2015-67072-
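The abstract above does not spell out the formula for persistent entropy. The sketch below assumes the commonly used definition: the Shannon entropy of the bar lengths of a barcode, normalised by their total length. The function name `persistent_entropy` and the treatment of zero-length bars are illustrative assumptions, not the authors' implementation.

```python
import math


def persistent_entropy(barcode):
    """Shannon entropy of normalised bar lengths (assumed definition).

    `barcode` is a list of finite (birth, death) intervals. Bars with
    non-positive length are ignored; an empty barcode gives 0.0.
    """
    lengths = [death - birth for birth, death in barcode if death > birth]
    total = sum(lengths)
    if total == 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lengths)


# Many equally long bars give high entropy; one dominant bar gives low entropy.
uniform = persistent_entropy([(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)])
dominant = persistent_entropy([(0.0, 10.0), (0.0, 0.1), (0.0, 0.1)])
```

Under this definition, a barcode of n equal bars attains the maximum value log(n), so a value well below that maximum suggests that a few long-lived bars dominate the short-lived ones.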