Rates of convergence for robust geometric inference
Distances to compact sets are widely used in the field of Topological Data
Analysis for inferring geometric and topological features from point clouds. In
this context, the distance to a probability measure (DTM) has been introduced
by Chazal et al. (2011) as a robust alternative to the distance to a compact set.
In practice, the DTM can be estimated by its empirical counterpart, that is, the
distance to the empirical measure (DTEM). In this paper we give a tight control
of the deviation of the DTEM. Our analysis relies on a local analysis of
empirical processes. In particular, we show that the rate of convergence of
the DTEM depends directly on the regularity at zero of a particular quantile
function which contains some local information about the geometry of the
support. This quantile function is the relevant quantity to describe precisely
how difficult a geometric inference problem is. Several numerical experiments
illustrate the convergence of the DTEM and also confirm that our bounds are
tight.
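As an illustration of the empirical counterpart mentioned in this abstract, here is a minimal sketch of the DTEM at a query point. It assumes the common squared form of the distance to a measure with mass parameter m = k/n, in which the DTEM is the root mean squared distance from the query to its k nearest sample points. The function name `empirical_dtm` and the parameter `k` are illustrative choices, not taken from the paper, which may use a more general form.

```python
import math

def empirical_dtm(points, query, k):
    """Distance to the empirical measure at `query` (assumed squared form).

    points: sequence of coordinate tuples (the sample)
    query:  coordinate tuple of the same dimension
    k:      number of nearest neighbours, i.e. mass parameter m = k / len(points)
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must satisfy 1 <= k <= len(points)")
    # Squared Euclidean distances from the query to every sample point, sorted.
    sq_dists = sorted(
        sum((p_i - q_i) ** 2 for p_i, q_i in zip(p, query)) for p in points
    )
    # Root of the mean of the k smallest squared distances.
    return math.sqrt(sum(sq_dists[:k]) / k)

# Three clustered points plus one far-away outlier.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
print(empirical_dtm(sample, (10.0, 10.0), 1))  # plain distance to the sample
print(empirical_dtm(sample, (10.0, 10.0), 3))  # averaged over 3 neighbours
```

With k = 1 the function reduces to the ordinary distance to the point cloud, which vanishes at the outlier. A larger k averages over several neighbours, so the isolated point no longer looks close to the data.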
DTM-based Filtrations
Despite strong stability properties, the persistent homology of filtrations
classically used in Topological Data Analysis, such as the Čech or
Vietoris-Rips filtrations, is very sensitive to the presence of outliers in
the data from which they are computed. In this paper, we introduce and study a
new family of filtrations, the DTM-filtrations, built on top of point clouds in
Euclidean space, which are more robust to noise and outliers. The approach
adopted in this work relies on the notion of distance-to-measure functions, and
extends some previous work on the approximation of such functions.
Comment: Abel Symposia, Springer, In press, Topological Data Analysis
A Statistical Approach to Topological Data Analysis
Until very recently, topological data analysis and topological inference methods mostly relied on deterministic approaches. The major part of this habilitation thesis presents a statistical approach to such topological methods. We first develop model selection tools for selecting simplicial complexes in a given filtration. Next, we study the estimation of persistent homology on metric spaces. We also study a robust version of topological data analysis. Related to this last topic, we also investigate the problem of Wasserstein deconvolution. The second part of the habilitation thesis gathers our contributions in other fields of statistics, including a model selection method for Gaussian mixtures, an implementation of the slope heuristic for calibrating penalties, and a study of Breiman’s permutation importance measure in the context of random forests.
A Unifying Model of Genome Evolution Under Parsimony
We present a data structure called a history graph that offers a practical
basis for the analysis of genome evolution. It conceptually simplifies the
study of parsimonious evolutionary histories by representing both substitutions
and double cut and join (DCJ) rearrangements in the presence of duplications.
The problem of constructing parsimonious history graphs thus subsumes related
maximum parsimony problems in the fields of phylogenetic reconstruction and
genome rearrangement. We show that tractable functions can be used to define
upper and lower bounds on the minimum number of substitutions and DCJ
rearrangements needed to explain any history graph. These bounds become tight
for a special type of unambiguous history graph called an ancestral variation
graph (AVG), which constrains in its combinatorial structure the number of
operations required. We finally demonstrate that, for a given history graph,
a finite set of AVGs describes all of its parsimonious interpretations, and this
set can be explored with a few sampling moves.
Comment: 52 pages, 24 figures
Tight bounds for the learning of homotopy à la Niyogi, Smale, and Weinberger for subsets of Euclidean spaces and of Riemannian manifolds
In this article we extend and strengthen the seminal work by Niyogi, Smale,
and Weinberger on the learning of the homotopy type from a sample of an
underlying space. In their work, Niyogi, Smale, and Weinberger studied samples
of manifolds with positive reach embedded in Euclidean space. We extend
their results in the following ways: In the first part of our paper we consider
both manifolds of positive reach -- a more general setting than the manifolds
they studied -- and sets of positive reach embedded in Euclidean space. The
sample of such a set does not have to lie directly on it. Instead, we
assume that the two one-sided Hausdorff distances between the sample and the
set are bounded. We provide explicit bounds in
terms of these two distances that guarantee that there exists a
radius such that the union of balls of that radius centred at the sample
deformation-retracts to the set.
In the second part of our paper we study homotopy learning in a significantly
more general setting -- we investigate sets of positive reach and submanifolds
of positive reach embedded in a Riemannian manifold with bounded
sectional curvature. To this end we introduce a new version of the reach in
the Riemannian setting inspired by the cut locus. Yet again, we provide tight
bounds on the two one-sided Hausdorff distances for both cases (submanifolds as well as
sets of positive reach), exhibiting the tightness by an explicit construction.
Comment: 74 pages, 29 figures
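The two one-sided Hausdorff distances that this abstract uses as sampling conditions can be sketched for finite point sets as follows. This is only an illustration: the underlying set is replaced by a finite discretisation, which is an assumption made here and not part of the paper's setting, and the function name `one_sided_hausdorff` is hypothetical.

```python
import math

def one_sided_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point in `b`.

    Both arguments are non-empty sequences of coordinate tuples.
    The result is not symmetric in `a` and `b`.
    """
    return max(min(math.dist(x, y) for y in b) for x in a)

# Hypothetical data: a sample and a finite discretisation of the set.
sample = [(0.0, 0.0), (1.0, 0.0)]
discretised_set = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

# How far the sample strays from the set (noise level).
print(one_sided_hausdorff(sample, discretised_set))
# How far the set is from the sample (sampling density).
print(one_sided_hausdorff(discretised_set, sample))
```

Keeping the two directions separate, instead of taking their maximum as in the symmetric Hausdorff distance, lets noise and sampling density be bounded independently.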