1,260 research outputs found

Rates of convergence for robust geometric inference

Author: Chazal Frédéric
Massart Pascal
Michel Bertrand
Publication date: 29/03/2015

Distances to compact sets are widely used in the field of Topological Data Analysis for inferring geometric and topological features from point clouds. In this context, the distance to a probability measure (DTM) has been introduced by Chazal et al. (2011) as a robust alternative to the distance to a compact set. In practice, the DTM can be estimated by its empirical counterpart, that is, the distance to the empirical measure (DTEM). In this paper we give a tight control of the deviation of the DTEM. Our analysis relies on a local analysis of empirical processes. In particular, we show that the rate of convergence of the DTEM depends directly on the regularity at zero of a particular quantile function, which contains some local information about the geometry of the support. This quantile function is the relevant quantity to describe precisely how difficult a geometric inference problem is. Several numerical experiments illustrate the convergence of the DTEM and also confirm that our bounds are tight.
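For readers unfamiliar with the DTEM mentioned in this abstract, the following is a minimal sketch, not the authors' code. It assumes the common simplification in which the mass parameter m is rounded up to k = ceil(m * n) nearest sample points, and the squared DTEM is the average of the k smallest squared distances. The function name `empirical_dtm` and its parameters are hypothetical.

```python
import math

def empirical_dtm(points, x, m):
    """Distance from x to the empirical measure on `points`, mass parameter m in (0, 1].

    Assumption: k = ceil(m * n) neighbours carry equal weight; the exact
    definition puts a fractional weight on the k-th neighbour when m * n
    is not an integer.
    """
    n = len(points)
    k = max(1, math.ceil(m * n))
    # Squared Euclidean distances from x to every sample point, ascending.
    sq_dists = sorted(sum((a - b) ** 2 for a, b in zip(p, x)) for p in points)
    # Root of the mean of the k smallest squared distances.
    return math.sqrt(sum(sq_dists[:k]) / k)

# Three points near the origin plus one far outlier.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
value_at_origin = empirical_dtm(sample, (0.0, 0.0), 0.5)
value_near_outlier = empirical_dtm(sample, (9.0, 9.0), 0.5)
```

With m = 0.5, the value near the outlier stays large because the second-nearest sample point is far away, which is the robustness the abstract refers to; the plain distance to the point cloud would be small there.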

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

DTM-based Filtrations

Author: Anai Hirokazu
Chazal Frédéric
Glisse Marc
Ike Yuichi
Inakoshi Hiroya
Tinarrage Raphaël
Umeda Yuhei
Publication date: 26/05/2020

Despite strong stability properties, the persistent homology of filtrations classically used in Topological Data Analysis, such as the Čech or Vietoris-Rips filtrations, is very sensitive to the presence of outliers in the data from which it is computed. In this paper, we introduce and study a new family of filtrations, the DTM-filtrations, built on top of point clouds in Euclidean space, which are more robust to noise and outliers. The approach adopted in this work relies on the notion of distance-to-measure functions, and extends some previous work on the approximation of such functions.

Comment: Abel Symposia, Springer, In press, Topological Data Analysis

arXiv.org e-Print Archive

A Statistical Approach to Topological Data Analysis

Author: Michel Bertrand
Publication venue: HAL CCSD
Publication date: 24/11/2015

Until very recently, topological data analysis and topological inference methods mostly relied on deterministic approaches. The major part of this habilitation thesis presents a statistical approach to such topological methods. We first develop model selection tools for selecting simplicial complexes in a given filtration. Next, we study the estimation of persistent homology on metric spaces. We also study a robust version of topological data analysis. Related to this last topic, we also investigate the problem of Wasserstein deconvolution. The second part of the habilitation thesis gathers our contributions in other fields of statistics, including a model selection method for Gaussian mixtures, an implementation of the slope heuristic for calibrating penalties, and a study of Breiman’s permutation importance measure in the context of random forests.

Thèses en Ligne

INRIA a CCSD electronic archive server

A Unifying Model of Genome Evolution Under Parsimony

Author: A Bergeron
A Caprara
AE Darling
AW Xu
B Paten
B Raphael
Benedict Paten
C Chauve
D Bienstock
Daniel R Zerbino
David Haussler
E Tannier
G Bourque
Glenn Hickey
I Elias
J Edmonds
J Felsenstein
J Kim
J Ma
L Chindelevitch
LL Wang
M Alekseyev
M Bader
M Blanchette
M Shao
MD Braga
N El-Mabrouk
O Westesson
P Medvedev
S Hannenhalli
S Yancopoulos
W Day
W Miller
YS Song
Publication date: 12/05/2014

We present a data structure called a history graph that offers a practical basis for the analysis of genome evolution. It conceptually simplifies the study of parsimonious evolutionary histories by representing both substitutions and double cut and join (DCJ) rearrangements in the presence of duplications. The problem of constructing parsimonious history graphs thus subsumes related maximum parsimony problems in the fields of phylogenetic reconstruction and genome rearrangement. We show that tractable functions can be used to define upper and lower bounds on the minimum number of substitutions and DCJ rearrangements needed to explain any history graph. These bounds become tight for a special type of unambiguous history graph called an ancestral variation graph (AVG), which constrains in its combinatorial structure the number of operations required. We finally demonstrate that for a given history graph $G$, a finite set of AVGs describes all parsimonious interpretations of $G$, and this set can be explored with a few sampling moves.

Comment: 52 pages, 24 figures

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Tight bounds for the learning of homotopy à la Niyogi, Smale, and Weinberger for subsets of Euclidean spaces and of Riemannian manifolds

Author: Attali Dominique
Fillmore Christopher
Ghosh Ishika
Kouřimská Hana Dal Poz
Lieutier André
Stephenson Elizabeth
Wintraecken Mathijs
Publication date: 14/09/2023

In this article we extend and strengthen the seminal work by Niyogi, Smale, and Weinberger on the learning of the homotopy type from a sample of an underlying space. In their work, Niyogi, Smale, and Weinberger studied samples of $C^2$ manifolds with positive reach embedded in $\mathbb{R}^d$. We extend their results in the following ways: In the first part of our paper we consider both manifolds of positive reach -- a more general setting than $C^2$ manifolds -- and sets of positive reach embedded in $\mathbb{R}^d$. The sample $P$ of such a set $\mathcal{S}$ does not have to lie directly on it. Instead, we assume that the two one-sided Hausdorff distances -- $\varepsilon$ and $\delta$ -- between $P$ and $\mathcal{S}$ are bounded. We provide explicit bounds in terms of $\varepsilon$ and $\delta$ that guarantee that there exists a parameter $r$ such that the union of balls of radius $r$ centred at the sample $P$ deformation-retracts to $\mathcal{S}$. In the second part of our paper we study homotopy learning in a significantly more general setting -- we investigate sets of positive reach and submanifolds of positive reach embedded in a Riemannian manifold with bounded sectional curvature. To this end we introduce a new version of the reach in the Riemannian setting inspired by the cut locus. Yet again, we provide tight bounds on $\varepsilon$ and $\delta$ for both cases (submanifolds as well as sets of positive reach), exhibiting the tightness by an explicit construction.

Comment: 74 pages, 29 figures
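As an illustration of the quantities named in this abstract, the sketch below computes the two one-sided Hausdorff distances between finite point sets and tests membership in a union of balls. It is only a sketch under stated assumptions: the underlying set is replaced by a finite discretisation, the function names are hypothetical, and the abstract does not say which direction is called ε and which δ, so neither is labelled here.

```python
import math

def one_sided_hausdorff(A, B):
    """Largest distance from a point of A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def in_union_of_balls(x, P, r):
    """True if x lies in the union of closed balls of radius r centred at P."""
    return any(math.dist(x, p) <= r for p in P)

# Hypothetical data: S_disc discretises a segment, P is a sparse noisy sample.
S_disc = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
P = [(0.0, 0.1), (2.0, 0.1)]

sample_to_set = one_sided_hausdorff(P, S_disc)  # how far the sample strays from the set
set_to_sample = one_sided_hausdorff(S_disc, P)  # how well the sample covers the set
```

The two directions generally differ: here the sample stays close to the set, but the middle of the set is poorly covered, so the second value is much larger than the first.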

arXiv.org e-Print Archive