    A kernel for multi-parameter persistent homology

    Topological data analysis and its main method, persistent homology, provide a toolkit for computing topological information from high-dimensional and noisy data sets. Kernels for one-parameter persistent homology have been established to connect persistent homology with machine learning techniques, with applications to shape analysis, recognition, and classification. We contribute a kernel construction for multi-parameter persistence by integrating a one-parameter kernel weighted along straight lines. We prove that our kernel is stable and efficiently computable, which establishes a theoretical connection between topological data analysis and machine learning for multivariate data analysis.
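
    The construction combines a one-parameter kernel, evaluated on diagrams obtained by slicing the bifiltration along lines, with weights along those lines. Below is a minimal sketch, assuming a user-supplied routine slice_diagram(data, line) (hypothetical) that returns the one-parameter persistence diagram of the bifiltration restricted to a line; the one-parameter kernel shown is the scale-space kernel of Reininghaus et al., and the integral over lines is discretized as a weighted sum:

```python
import numpy as np

def scale_space_kernel(D, E, sigma=1.0):
    """Persistence scale-space kernel (Reininghaus et al.) between two
    one-parameter diagrams, given as (n, 2) arrays of (birth, death) pairs."""
    if len(D) == 0 or len(E) == 0:
        return 0.0
    D, E = np.asarray(D, float), np.asarray(E, float)
    E_bar = E[:, ::-1]  # each point mirrored across the diagonal
    d2 = ((D[:, None, :] - E[None, :, :]) ** 2).sum(-1)
    d2_bar = ((D[:, None, :] - E_bar[None, :, :]) ** 2).sum(-1)
    return (np.exp(-d2 / (8 * sigma))
            - np.exp(-d2_bar / (8 * sigma))).sum() / (8 * np.pi * sigma)

def multiparameter_kernel(X, Y, lines, weights, slice_diagram, sigma=1.0):
    """Discretized integral of the one-parameter kernel over a family of
    lines with positive slope. slice_diagram is a hypothetical helper that
    slices a bifiltration along a line and returns its diagram."""
    return sum(w * scale_space_kernel(slice_diagram(X, line),
                                      slice_diagram(Y, line), sigma)
               for line, w in zip(lines, weights))
```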

    Delaunay Bifiltrations of Functions on Point Clouds

    The Delaunay filtration $\mathcal{D}_\bullet(X)$ of a point cloud $X \subset \mathbb{R}^d$ is a central tool of computational topology. Its use is justified by the topological equivalence of $\mathcal{D}_\bullet(X)$ and the offset (i.e., union-of-balls) filtration of $X$. Given a function $\gamma: X \to \mathbb{R}$, we introduce a Delaunay bifiltration $\mathcal{DC}_\bullet(\gamma)$ that satisfies an analogous topological equivalence, ensuring that $\mathcal{DC}_\bullet(\gamma)$ topologically encodes the offset filtrations of all sublevel sets of $\gamma$, as well as the topological relations between them. $\mathcal{DC}_\bullet(\gamma)$ is of size $O(|X|^{\lceil\frac{d+1}{2}\rceil})$, which for $d$ odd matches the worst-case size of $\mathcal{D}_\bullet(X)$. Adapting the Bowyer-Watson algorithm for computing Delaunay triangulations, we give a simple, practical algorithm to compute $\mathcal{DC}_\bullet(\gamma)$ in time $O(|X|^{\lceil\frac{d}{2}\rceil+1})$. Our implementation, based on CGAL, computes $\mathcal{DC}_\bullet(\gamma)$ with modest overhead compared to computing $\mathcal{D}_\bullet(X)$, and handles tens of thousands of points in $\mathbb{R}^3$ within seconds. Comment: 28 pages, 7 figures, 8 tables. To appear in the proceedings of SODA2
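
    As a rough illustration only (not the paper's construction, which guarantees topological equivalence to the offset filtrations), one can bigrade the Delaunay complex of $X$ by a scale proxy together with the sublevel values of $\gamma$. The sketch below uses SciPy's Delaunay triangulation and takes half the longest edge of a cell as a crude stand-in for its alpha value:

```python
import numpy as np
from scipy.spatial import Delaunay

def delaunay_bigrades(X, gamma):
    """Assign each top-dimensional Delaunay cell an illustrative bigrade
    (scale, function value): half the longest edge as a scale proxy, and
    the maximum of gamma over the cell's vertices on the second axis.

    X:     (n, d) array of points.
    gamma: (n,) array of function values, gamma[i] = gamma(X[i]).
    """
    tri = Delaunay(X)
    grades = {}
    for simplex in tri.simplices:
        pts = X[simplex]
        diam = max(np.linalg.norm(p - q)
                   for i, p in enumerate(pts) for q in pts[i + 1:])
        grades[tuple(sorted(simplex))] = (diam / 2.0, gamma[simplex].max())
    return grades
```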

    Multiparameter Persistence Images for Topological Machine Learning

    In the last decade, there has been increasing interest in topological data analysis, a new methodology for using geometric structures in data for inference and learning. A central theme in the area is the idea of persistence, which in its most basic form studies how measures of shape change as a scale parameter varies. There are now a number of frameworks that support statistics and machine learning in this context. However, in many applications there are several different parameters one might wish to vary: for example, scale and density. In contrast to the one-parameter setting, techniques for applying statistics and machine learning in the setting of multiparameter persistence are not well understood, due to the lack of a concise representation of the results. We introduce a new descriptor for multiparameter persistence, which we call the Multiparameter Persistence Image, that is suitable for machine learning and statistical frameworks, is robust to perturbations in the data, has finer resolution than existing descriptors based on slicing, and can be efficiently computed on data sets of realistic size. Moreover, we demonstrate its efficacy by comparing its performance to other multiparameter descriptors on several classification tasks.
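
    The descriptor generalizes the one-parameter persistence image to the multiparameter setting. A minimal sketch of that one-parameter building block follows (the grid extent, bandwidth sigma, and persistence weighting are illustrative choices, not the paper's parameters):

```python
import numpy as np

def persistence_image(diagram, resolution=20, sigma=0.1,
                      extent=(0.0, 1.0, 0.0, 1.0)):
    """One-parameter persistence image: diagram points (birth, death) are
    mapped to (birth, persistence) coordinates, smoothed with Gaussians
    weighted by persistence, and sampled on a regular grid."""
    xmin, xmax, ymin, ymax = extent
    gx, gy = np.meshgrid(np.linspace(xmin, xmax, resolution),
                         np.linspace(ymin, ymax, resolution))
    img = np.zeros_like(gx)
    for b, d in diagram:
        pers = d - b
        img += pers * np.exp(-((gx - b) ** 2 + (gy - pers) ** 2)
                             / (2 * sigma ** 2))
    return img.ravel()  # flattened, ready to use as a feature vector
```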

    Fast Minimal Presentations of Bi-graded Persistence Modules

    Multi-parameter persistent homology is a recent branch of topological data analysis. In this area, data sets are investigated through the lens of homology with respect to two or more scale parameters. The high computational cost of many algorithms calls for a preprocessing step to reduce the input size. In general, a minimal presentation is the smallest possible representation of a persistence module. Lesnick and Wright recently proposed an algorithm (the LW-algorithm) for computing minimal presentations based on matrix reduction. In this work, we propose, implement and benchmark several improvements over the LW-algorithm. Most notably, we propose the use of priority queues to avoid extensive scanning of the matrix columns, which constitutes the computational bottleneck in the LW-algorithm, and we combine their algorithm with ideas from the multi-parameter chunk algorithm by Fugacci and Kerber. Our extensive experiments show that our algorithm outperforms the LW-algorithm and computes the minimal presentation for data sets with millions of simplices within a few seconds. Our software is publicly available. Comment: This is an extended version of a paper that will appear at ALENEX 202
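
    A minimal sketch of the priority-queue idea in the familiar one-parameter column-reduction setting, assuming Z/2 coefficients (the paper applies the trick to bigraded presentations inside the LW-algorithm): each column is kept as a lazy max-heap of row indices, so adding one column to another is a batch of heap pushes instead of a full scan and merge, and the pivot is found by popping entries and cancelling duplicates in pairs.

```python
import heapq

def pop_pivot(col):
    """Pop and return the largest row index with odd multiplicity; pairs of
    equal entries cancel over Z/2. Returns None for a (lazily) zero column."""
    pivot = None
    while col:
        top = -heapq.heappop(col)  # entries are stored negated (max-heap)
        if top == pivot:
            pivot = None                 # second copy cancels mod 2
        elif pivot is None:
            pivot = top
        else:
            heapq.heappush(col, -top)    # unmatched smaller entry: put it back
            return pivot
    return pivot

def reduce_columns(columns):
    """Left-to-right column reduction over Z/2 with lazy-heap columns.
    Returns a map pivot row -> index of the column owning that pivot."""
    cols = [[-r for r in c] for c in columns]
    for c in cols:
        heapq.heapify(c)
    pivot_owner = {}
    for j, col in enumerate(cols):
        p = pop_pivot(col)
        while p is not None and p in pivot_owner:
            heapq.heappush(col, -p)      # restore the popped pivot copy
            for r in cols[pivot_owner[p]]:
                heapq.heappush(col, r)   # lazy addition of the owner column
            p = pop_pivot(col)           # the two copies of p now cancel
        if p is not None:
            heapq.heappush(col, -p)      # keep the final pivot in the column
            pivot_owner[p] = j
    return pivot_owner
```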

    Approximating Continuous Functions on Persistence Diagrams Using Template Functions

    The persistence diagram is an increasingly useful tool from Topological Data Analysis, but its use alongside typical machine learning techniques requires mathematical finesse. The most success to date has come from methods that map persistence diagrams into $\mathbb{R}^n$ in a way that maximizes the structure preserved. This process is commonly referred to as featurization. In this paper, we describe a mathematical framework for featurization using template functions. These functions are general, as they are only required to be continuous and compactly supported. We discuss two realizations: tent functions, which emphasize the local contributions of points in a persistence diagram, and interpolating polynomials, which capture global pairwise interactions. We combine the resulting features with classification and regression algorithms on several examples, including shape data and the Rössler system. Our results show that using template functions yields high accuracy rates that match and often exceed those of existing featurization methods. One counter-intuitive observation is that in most cases using interpolating polynomials, where each point contributes globally to the feature vector, yields significantly better results than using tent functions, where the contribution of each point is localized. Along the way, we provide a complete characterization of compactness in the space of persistence diagrams.
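
    A minimal sketch of the tent-function featurization, assuming templates are evaluated in (birth, lifetime) coordinates on a user-chosen grid of centers with radius delta (the grid and delta below are illustrative):

```python
import numpy as np

def tent_features(diagram, centers, delta):
    """Each feature is one tent template, of radius delta and centered at
    (a, b) in (birth, lifetime) coordinates, summed over all diagram points."""
    D = np.asarray(diagram, dtype=float)
    if len(D) == 0:
        return np.zeros(len(centers))
    bl = np.column_stack([D[:, 0], D[:, 1] - D[:, 0]])  # (birth, lifetime)
    feats = []
    for a, b in centers:
        t = 1.0 - np.max(np.abs(bl - (a, b)), axis=1) / delta
        feats.append(np.maximum(t, 0.0).sum())
    return np.array(feats)

# Illustrative usage: a 5x5 grid of tent centers over [0, 1] x [0, 1]
grid = [(a, b) for a in np.linspace(0, 1, 5) for b in np.linspace(0, 1, 5)]
fv = tent_features([(0.1, 0.5), (0.2, 0.9)], grid, delta=0.25)
```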