Interpretable statistics for complex modelling: quantile and topological learning
As the complexity of our data has increased exponentially over the last decades, so has our
need for interpretable features. This thesis revolves around two paradigms for approaching
this quest for insights.
In the first part we focus on parametric models, where the problem of interpretability
can be seen as a “parametrization selection”. We introduce a quantile-centric
parametrization and show the advantages of our proposal in the context of regression,
where it allows us to bridge the gap between classical generalized linear (mixed)
models and increasingly popular quantile methods.
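The quantile methods the thesis connects to are built on the pinball (check) loss; the following is a minimal illustrative sketch of that loss, not the thesis's quantile-centric parametrization (the function name is ours):

```python
def pinball_loss(y, q, tau):
    """Pinball (check) loss for quantile level tau in (0, 1).
    Minimizing its mean over q recovers the tau-quantile of y;
    quantile regression replaces q with a linear predictor."""
    u = y - q
    return tau * u if u >= 0 else (tau - 1) * u
```

At tau = 0.5 the loss is half the absolute error, so minimizing it recovers the median; asymmetric tau values tilt the optimum toward higher or lower quantiles.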
The second part of the thesis, concerned with topological learning, tackles the
problem from a non-parametric perspective. As topology can be thought of as a way
of characterizing data in terms of their connectivity structure, it allows us to represent
complex and possibly high-dimensional data through a few features, such as the number of
connected components, loops and voids. We illustrate how the emerging branch of
statistics devoted to recovering topological structures in the data, Topological Data
Analysis, can be exploited for both exploratory and inferential purposes, with a special
emphasis on kernels that preserve the topological information in the data.
Finally, we show with an application how these two approaches can borrow strength
from one another in the identification and description of brain activity through fMRI
data from the ABIDE project.
The Importance of Forgetting: Limiting Memory Improves Recovery of Topological Characteristics from Neural Data
We develop a line of work initiated by Curto and Itskov towards
understanding the amount of information contained in the spike trains of
hippocampal place cells via topology considerations. Previously, it was
established that simply knowing which groups of place cells fire together in an
animal's hippocampus is sufficient to extract the global topology of the
animal's physical environment. We model a system where collections of place
cells group and ungroup according to short-term plasticity rules. In
particular, we obtain the surprising result that in experiments with spurious
firing, the accuracy of the extracted topological information decreases with
the persistence (beyond a certain regime) of the cell groups. This suggests
that synaptic transience, or forgetting, is a mechanism by which the brain
counteracts the effects of spurious place cell activity.
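A minimal sketch of the co-firing idea the abstract builds on (a hypothetical helper, not the authors' plasticity model): from a binary spike raster, collect the distinct groups of cells active in the same time bin; in the Curto-Itskov construction each such group spans a simplex of the complex whose topology is then analyzed.

```python
def cofiring_groups(raster):
    """Distinct cell groups from a binary (time bin x cell) spike raster.
    Each group, the set of cells active in one bin, would span a simplex
    of the co-firing complex."""
    groups = set()
    for time_bin in raster:
        active = frozenset(c for c, fired in enumerate(time_bin) if fired)
        if active:
            groups.add(active)
    return groups
```

The paper's short-term plasticity rules would additionally let groups form and dissolve over time; this sketch only extracts the static set of groups.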
Persistence Flamelets: multiscale Persistent Homology for kernel density exploration
In recent years there has been noticeable interest in the study of the "shape
of data". Among the many ways a "shape" could be defined, topology is the most
general one, as it describes an object in terms of its connectivity structure:
connected components (topological features of dimension 0), cycles (features of
dimension 1) and so on. There is a growing number of techniques, generally
denoted as Topological Data Analysis, aimed at estimating topological
invariants of a fixed object; when we allow this object to change, however,
little has been done to investigate the evolution in its topology. In this work
we define the Persistence Flamelets, a multiscale version of one of the most
popular tools in TDA, the Persistence Landscape. We examine its theoretical
properties and show how it can be used to gain insight into the bandwidth
parameter of kernel density estimators.
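For reference, the Persistence Landscape that the Flamelets extend can be evaluated pointwise directly from a persistence diagram; a minimal sketch (the function name is ours):

```python
def landscape_value(diagram, k, t):
    """k-th persistence landscape (k = 1 is the outermost) evaluated at t.
    Each (birth, death) pair contributes a 'tent' of height
    max(0, min(t - birth, death - t)); the landscape takes the k-th largest."""
    tents = sorted(
        (max(0.0, min(t - b, d - t)) for b, d in diagram),
        reverse=True,
    )
    return tents[k - 1] if k <= len(tents) else 0.0
```

The Flamelets then track how these landscape functions evolve as a scale parameter, such as the KDE bandwidth, varies.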
Linear-Size Approximations to the Vietoris-Rips Filtration
The Vietoris-Rips filtration is a versatile tool in topological data
analysis. It is a sequence of simplicial complexes built on a metric space to
add topological structure to an otherwise disconnected set of points. It is
widely used because it encodes useful information about the topology of the
underlying metric space. This information is often extracted from its so-called
persistence diagram. Unfortunately, this filtration is often too large to
construct in full. We show how to construct an O(n)-size filtered simplicial
complex on an n-point metric space such that its persistence diagram is a
good approximation to that of the Vietoris-Rips filtration. This new filtration
can be constructed in O(n log n) time. The constant factors in both the size
and the running time depend only on the doubling dimension of the metric space
and the desired tightness of the approximation. For the first time, this makes
it computationally tractable to approximate the persistence diagram of the
Vietoris-Rips filtration across all scales for large data sets.
We describe two different sparse filtrations. The first is a zigzag
filtration that removes points as the scale increases. The second is a
(non-zigzag) filtration that yields the same persistence diagram. Both methods
are based on a hierarchical net-tree and yield the same guarantees.
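In dimension 0 the persistence diagram of the Vietoris-Rips filtration can be computed exactly with a Kruskal-style union-find over the sorted pairwise distances; a self-contained sketch (quadratic in the number of points, unlike the paper's linear-size construction):

```python
import math
from itertools import combinations

def rips_h0_diagram(points):
    """0-dimensional persistence of the Vietoris-Rips filtration.
    Every point is born at scale 0; a component dies at the edge length
    where it merges into another (Kruskal-style union-find)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    diagram = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            diagram.append((0.0, d))  # one component dies at scale d
    diagram.append((0.0, math.inf))  # the last component never dies
    return diagram
```

The sparse filtrations in the paper avoid enumerating all O(n^2) edges, which is what makes large data sets tractable.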