Persistence-Based Clustering in Riemannian Manifolds
We present a novel clustering algorithm that combines a mode-seeking phase with a cluster merging phase. While mode detection is performed by a standard graph-based hill-climbing scheme, the novelty of our approach resides in its use of {\em topological persistence} theory to guide the merges between clusters. An interesting feature of our algorithm is to provide additional feedback in the form of a finite set of points in the plane, called a {\em persistence diagram}, which provably reflects the prominence of each of the modes of the density. Such feedback is an invaluable tool in practice, as it enables the user to determine a set of parameter values that will make the algorithm compute a relevant clustering on the next run. In terms of generality, our approach requires only the knowledge of (approximate) pairwise distances between the data points, as well as rough estimates of the density at these points. It is therefore applicable in virtually any metric space. Meanwhile, its complexity remains reasonable: although the size of the input distance matrix may be up to quadratic in the number of data points, a careful implementation only uses a linear amount of main memory and barely takes more time to run than is spent reading the input. Taking advantage of recent advances in topological persistence theory, we are able to give a theoretically sound notion of what the {\em correct} number of clusters is, and to prove that under mild sampling conditions and a relevant choice of parameters (made possible in practice by the persistence diagram) our clustering scheme computes a set of clusters whose spatial locations are close to those of the basins of attraction of the peaks of the density. These guarantees hold in a large variety of contexts, including when data points are distributed along some unknown Riemannian manifold.
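The mode-seeking-plus-merging scheme described above can be sketched in a few lines. The sketch below is illustrative, not the paper's exact algorithm: the k-nearest-neighbor density estimate, the neighborhood size `k`, and the merge threshold `tau` are all assumptions made here for concreteness. Points are processed in decreasing density order; a point with no higher-density neighbor starts a new cluster (a mode), and clusters whose peak prominence falls below `tau` are merged, with each merge recorded as a (birth, death) pair of the persistence diagram.

```python
import numpy as np
from scipy.spatial import cKDTree

def tomato(points, k=6, tau=0.1):
    """Persistence-guided mode-seeking clustering (illustrative sketch).

    points : (n, d) array of data points.
    k      : neighborhood size for the k-NN graph and density estimate.
    tau    : prominence threshold below which clusters are merged.
    Returns (labels, diagram); diagram holds (birth, death) density pairs,
    one per merged mode, whose prominence is birth - death.
    """
    n = len(points)
    tree = cKDTree(points)
    dist, idx = tree.query(points, k=k + 1)        # column 0 is the point itself
    density = 1.0 / (dist[:, -1] + 1e-12)          # crude k-NN density estimate
    order = np.argsort(-density)                   # decreasing density
    rank = np.empty(n, dtype=int)
    rank[order] = np.arange(n)

    parent = np.arange(n)                          # union-find forest
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]          # path halving
            i = parent[i]
        return i

    diagram = []
    for i in order:
        nbrs = [j for j in idx[i, 1:] if rank[j] < rank[i]]
        if not nbrs:                               # local mode: start a cluster
            continue
        # hill-climbing step: attach to the highest-density neighbor's cluster
        parent[i] = find(max(nbrs, key=lambda j: density[j]))
        for j in nbrs:                             # neighboring clusters meet here
            ri, rj = find(i), find(j)
            if ri != rj:
                lo, hi = (ri, rj) if density[ri] < density[rj] else (rj, ri)
                if density[lo] - density[i] < tau: # low prominence: merge
                    diagram.append((density[lo], density[i]))
                    parent[lo] = hi
    labels = np.array([find(i) for i in range(n)])
    return labels, diagram
```

With `tau` set very large, every pair of clusters that meets in the graph is merged, so only graph-disconnected groups survive; inspecting `diagram` first, as the abstract suggests, lets the user pick a `tau` separating prominent modes from noise.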
A Stable Multi-Scale Kernel for Topological Machine Learning
Topological data analysis offers a rich source of valuable information to
study vision problems. Yet, so far we lack a theoretically sound connection to
popular kernel-based learning techniques, such as kernel SVMs or kernel PCA. In
this work, we establish such a connection by designing a multi-scale kernel for
persistence diagrams, a stable summary representation of topological features
in data. We show that this kernel is positive definite and prove its stability
with respect to the 1-Wasserstein distance. Experiments on two benchmark
datasets for 3D shape classification/retrieval and texture recognition show
considerable performance gains of the proposed method compared to an
alternative approach that is based on the recently introduced persistence
landscapes.
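A kernel of this kind can be sketched as a closed-form sum of Gaussians centered at the diagram points, with mirror images across the diagonal subtracted so that features of zero persistence contribute nothing; this is the scale-space construction of Reininghaus et al., and the normalization used below is an assumption for illustration.

```python
import numpy as np

def pss_kernel(F, G, sigma=1.0):
    """Multi-scale (scale-space) kernel between persistence diagrams (sketch).

    F, G  : (m, 2) and (n, 2) arrays of (birth, death) points.
    sigma : scale parameter of the kernel.
    """
    F = np.asarray(F, float).reshape(-1, 2)
    G = np.asarray(G, float).reshape(-1, 2)
    Gbar = G[:, ::-1]                              # mirror across the diagonal
    d  = ((F[:, None, :] - G[None, :, :]) ** 2).sum(-1)
    db = ((F[:, None, :] - Gbar[None, :, :]) ** 2).sum(-1)
    s = 8.0 * sigma
    return (np.exp(-d / s) - np.exp(-db / s)).sum() / (np.pi * s)
```

Because each diagonal point coincides with its own mirror image, its two exponential terms cancel, which is what makes the kernel insensitive to low-persistence noise near the diagonal.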
Supervised Learning with Indefinite Topological Kernels
Topological Data Analysis (TDA) is a recent and growing branch of statistics
devoted to the study of the shape of the data. In this work we investigate the
predictive power of TDA in the context of supervised learning. Since
topological summaries, most noticeably the Persistence Diagram, are typically
defined in complex spaces, we adopt a kernel approach to translate them into
more familiar vector spaces. We define a topological exponential kernel, we
characterize it, and we show that, despite not being positive semi-definite, it
can be successfully used in regression and classification tasks.
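An exponential kernel on diagrams of the general form exp(-d(D1, D2)^2 / 2 sigma^2) can be sketched as follows; the particular diagram distance used here (a 2-Wasserstein-type matching that allows points to be transported to the diagonal) and the value of sigma are illustrative assumptions, not necessarily the paper's exact definitions. Kernels built this way on Wasserstein-type distances are indefinite in general, which is the setting the abstract describes.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

BIG = 1e12  # stand-in for a forbidden matching

def wasserstein2(D1, D2):
    """2-Wasserstein-type distance between persistence diagrams (sketch).

    Each diagram is augmented with the diagonal projections of the other's
    points, and an optimal matching is found by the Hungarian algorithm.
    """
    D1 = np.asarray(D1, float).reshape(-1, 2)
    D2 = np.asarray(D2, float).reshape(-1, 2)
    n, m = len(D1), len(D2)
    to_diag = lambda P: (P[:, 1] - P[:, 0]) / np.sqrt(2)   # distance to diagonal
    C = np.zeros((n + m, n + m))
    C[:n, :m] = np.linalg.norm(D1[:, None] - D2[None, :], axis=-1) ** 2
    C[:n, m:] = BIG                                        # point -> its own projection only
    C[:n, m:][np.arange(n), np.arange(n)] = to_diag(D1) ** 2
    C[n:, :m] = BIG
    C[n:, :m][np.arange(m), np.arange(m)] = to_diag(D2) ** 2
    # bottom-right block stays 0: diagonal matched to diagonal is free
    r, c = linear_sum_assignment(C)
    return np.sqrt(C[r, c].sum())

def topo_exp_kernel(D1, D2, sigma=1.0):
    """Gaussian-type kernel on diagrams; not positive semi-definite in general."""
    return np.exp(-wasserstein2(D1, D2) ** 2 / (2.0 * sigma ** 2))
```

Since the distance handles diagrams of different sizes (unmatched points pay their distance to the diagonal), the kernel can be evaluated on any pair of diagrams and plugged into kernel regression or classification, accepting that the Gram matrix may have negative eigenvalues.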
Exact Computation of a Manifold Metric, via Lipschitz Embeddings and Shortest Paths on a Graph
Data-sensitive metrics adapt distances locally based on the density of data
points with the goal of aligning distances and some notion of similarity. In
this paper, we give the first exact algorithm for computing a data-sensitive
metric called the nearest neighbor metric. In fact, we prove the surprising
result that a previously published approximation algorithm is in fact exact.
The nearest neighbor metric can be viewed as a special case of a
density-based distance used in machine learning, or it can be seen as an
example of a manifold metric. Previous computational research on such metrics
despaired of computing exact distances on account of the apparent difficulty of
minimizing over all continuous paths between a pair of points. We leverage the
exact computation of the nearest neighbor metric to compute sparse spanners and
persistent homology. We also explore the behavior of the metric built from
point sets drawn from an underlying distribution and consider the more general
case of inputs that are finite collections of path-connected compact sets.
The main results connect several classical theories such as the conformal
change of Riemannian metrics, the theory of positive definite functions of
Schoenberg, and screw function theory of Schoenberg and Von Neumann. We develop
novel proof techniques based on the combination of screw functions and
Lipschitz extensions that may be of independent interest.
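The kind of graph computation the abstract refers to can be sketched with the "edge-squared" metric, a standard data-sensitive graph metric: connect the sample points with edges weighted by the squared Euclidean length (scaled by 1/4, matching the integral of the distance-to-sample along a segment between two sample points) and take shortest paths. Whether this is precisely the construction the paper analyzes is an assumption here; the sketch only illustrates how a continuous path-minimization problem reduces to Dijkstra on a finite graph.

```python
import numpy as np
from scipy.sparse.csgraph import dijkstra

def edge_squared_metric(points):
    """All-pairs shortest-path distances in the 'edge-squared' graph (sketch).

    points : (n, d) array of sample points. The complete graph on the sample
    gets edge weight |p - q|^2 / 4; Dijkstra then returns the graph metric.
    """
    P = np.asarray(points, float)
    W = np.linalg.norm(P[:, None] - P[None, :], axis=-1) ** 2 / 4.0
    return dijkstra(W)  # dense weight matrix accepted as a graph
```

Note how shortcutting pays off under squared weights: for three collinear points at 0, 1, 2, the direct edge from 0 to 2 costs 4/4 = 1, while the two-hop path through 1 costs 1/4 + 1/4 = 0.5, so the metric bends paths toward dense regions of the sample, which is exactly the "data-sensitive" behavior described above.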