PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures
Persistence diagrams, the most common descriptors of Topological Data
Analysis, encode topological properties of data and have already proved pivotal
in many different applications of data science. However, since the (metric)
space of persistence diagrams is not Hilbert, they end up being difficult
inputs for most Machine Learning techniques. To address this concern, several
vectorization methods have been put forward that embed persistence diagrams
into either finite-dimensional Euclidean space or (implicit) infinite
dimensional Hilbert space with kernels. In this work, we focus on persistence
diagrams built on top of graphs. Relying on extended persistence theory and the
so-called heat kernel signature, we show how graphs can be encoded by
(extended) persistence diagrams in a provably stable way. We then propose a
general and versatile framework for learning vectorizations of persistence
diagrams, which encompasses most of the vectorization techniques used in the
literature. We finally showcase the experimental strength of our setup by
achieving competitive scores on classification tasks on real-life graph
datasets.
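To make the idea of a vectorization concrete, here is a minimal sketch, not the authors' layer: it maps each diagram point to Gaussian bumps evaluated at fixed centers, weights each point by its persistence, and sums the results so the output does not depend on point order. The function name `vectorize_diagram`, the `centers` grid, and the bandwidth `sigma` are all hypothetical choices made for illustration.

```python
import math


def vectorize_diagram(diagram, centers, sigma=0.5):
    """Map a persistence diagram to a fixed-length feature vector.

    diagram: list of (birth, death) pairs.
    centers: list of (x, y) locations where Gaussian bumps are evaluated
             (hypothetical; fixed here, not learned).
    sigma:   Gaussian bandwidth (hypothetical default).
    """
    features = [0.0] * len(centers)
    for birth, death in diagram:
        # Weight each point by its persistence (lifetime).
        weight = death - birth
        for i, (cx, cy) in enumerate(centers):
            sq_dist = (birth - cx) ** 2 + (death - cy) ** 2
            # Summing over points makes the result permutation invariant.
            features[i] += weight * math.exp(-sq_dist / (2 * sigma ** 2))
    return features


centers = [(0.0, 1.0), (0.5, 0.5)]
diagram = [(0.0, 1.0), (0.2, 0.5)]
print(vectorize_diagram(diagram, centers))
```

In a learned setting one would presumably treat the centers, the bandwidth, and the weight function as trainable parameters; here they are fixed to keep the sketch self-contained.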
Local Equivalence and Intrinsic Metrics between Reeb Graphs
As graphical summaries for topological spaces and maps, Reeb graphs are
common objects in the computer graphics or topological data analysis
literature. Defining good metrics between these objects has become an important
question for applications, where it matters to quantify the extent to which two
given Reeb graphs differ. Recent contributions emphasize this aspect, proposing
novel distances such as functional distortion or interleaving that
are provably more discriminative than the so-called bottleneck distance,
being true metrics whereas the latter is only a pseudo-metric. Their main
drawback compared to the bottleneck distance is that they are comparatively hard
(if at all possible) to evaluate. Here we take the opposite view on the problem and
show that the bottleneck distance is in fact good enough locally, in the
sense that it is able to discriminate a Reeb graph from any other Reeb graph in
a small enough neighborhood, as efficiently as the other metrics do. This
suggests considering the intrinsic metrics induced by these distances,
which turn out to be all globally equivalent. This novel viewpoint on the
study of Reeb graphs has a potential impact on applications, where one may not
only be interested in discriminating between data but also in interpolating
between them.
Interpretable statistics for complex modelling: quantile and topological learning
As the complexity of our data has increased exponentially over the last decades, so has our
need for interpretable features. This thesis revolves around two paradigms to approach
this quest for insights.
In the first part we focus on parametric models, where the problem of interpretability
can be seen as a “parametrization selection”. We introduce a quantile-centric
parametrization and we show the advantages of our proposal in the context of regression,
where it allows us to bridge the gap between classical generalized linear (mixed)
models and increasingly popular quantile methods.
The second part of the thesis, concerned with topological learning, tackles the
problem from a non-parametric perspective. As topology can be thought of as a way
of characterizing data in terms of their connectivity structure, it allows us to represent
complex and possibly high-dimensional data through a few features, such as the number of
connected components, loops and voids. We illustrate how the emerging branch of
statistics devoted to recovering topological structures in the data, Topological Data
Analysis, can be exploited both for exploratory and inferential purposes with a special
emphasis on kernels that preserve the topological information in the data.
Finally, we show with an application how these two approaches can borrow strength
from one another in the identification and description of brain activity through fMRI
data from the ABIDE project.
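As an illustration of the topological features mentioned above, the following sketch counts connected components and independent loops for the simplest case, a finite graph. It is not taken from the thesis: the function name `betti_numbers` and the edge-list input format are assumptions, and voids are omitted because they require higher-dimensional simplices.

```python
def betti_numbers(num_vertices, edges):
    """Return (connected components, independent loops) of an undirected graph.

    num_vertices: vertices are labelled 0 .. num_vertices - 1.
    edges:        list of (u, v) pairs (hypothetical input format).
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = num_vertices
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge merges two components.
            parent[root_u] = root_v
            components -= 1

    # Euler characteristic of a graph: V - E = b0 - b1.
    loops = len(edges) - num_vertices + components
    return components, loops


# A triangle plus one isolated vertex.
print(betti_numbers(4, [(0, 1), (1, 2), (2, 0)]))  # (2, 1)
```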