A categorical foundation for Bayesian probability
Given two measurable spaces with countably generated σ-algebras, a perfect
prior probability measure on one of them, and a sampling distribution, there is
a corresponding inference map, which is unique up to a set of measure zero.
Thus, given a data measurement, a posterior probability can be computed. This
procedure is iterative: with each updated probability, we obtain a new joint
distribution, which in turn yields a new inference map, and the process repeats
with each additional measurement. The main result uses an existence theorem for regular
conditional probabilities by Faden, which holds in more generality than the
setting of Polish spaces. This less stringent setting then allows for
non-trivial decision rules (Eilenberg--Moore algebras) on finite (as well as
non-finite) spaces, and also provides a common framework for decision
theory and Bayesian probability.
Comment: 15 pages; revised setting to more clearly explain how to incorporate
perfect measures and the Giry monad; to appear in Applied Categorical
Structures
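The iterative update described in this abstract can be illustrated in the simplest possible setting, where both spaces are finite. The following is only a minimal sketch under that assumption, not the paper's categorical construction: the names `posterior` and `update_all`, the coin example, and the dictionary encoding of the prior and sampling distribution are all hypothetical. The zero-evidence branch loosely mirrors the "unique up to a set of measure zero" caveat, since the update is undefined on data that the current prior rules out.

```python
from fractions import Fraction


def posterior(prior, sampling, datum):
    """One Bayesian update on finite spaces (illustrative assumption).

    prior:    dict mapping hypothesis -> probability
    sampling: dict mapping hypothesis -> dict of datum -> probability
    """
    # Joint probability of each hypothesis together with the observed datum.
    joint = {h: prior[h] * sampling[h].get(datum, 0) for h in prior}
    evidence = sum(joint.values())
    if evidence == 0:
        # The inference map is only determined off a set of measure zero.
        raise ValueError("datum has probability zero under the current prior")
    return {h: p / evidence for h, p in joint.items()}


def update_all(prior, sampling, data):
    """Repeat the update, using each posterior as the next prior."""
    belief = dict(prior)
    for datum in data:
        belief = posterior(belief, sampling, datum)
    return belief


# Hypothetical example: is a coin fair, or biased towards heads?
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
sampling = {
    "fair": {"H": Fraction(1, 2), "T": Fraction(1, 2)},
    "biased": {"H": Fraction(3, 4), "T": Fraction(1, 4)},
}
belief = update_all(prior, sampling, ["H", "H"])
```

Exact fractions are used so that the two successive updates can be checked by hand.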
Interpretable statistics for complex modelling: quantile and topological learning
As the complexity of our data has increased exponentially in the last decades, so has our
need for interpretable features. This thesis revolves around two paradigms to approach
this quest for insights.
In the first part we focus on parametric models, where the problem of interpretability
can be seen as a “parametrization selection”. We introduce a quantile-centric
parametrization and we show the advantages of our proposal in the context of regression,
where it allows us to bridge the gap between classical generalized linear (mixed)
models and increasingly popular quantile methods.
The second part of the thesis, concerned with topological learning, tackles the
problem from a non-parametric perspective. As topology can be thought of as a way
of characterizing data in terms of their connectivity structure, it allows us to represent
complex and possibly high-dimensional data through a few features, such as the number of
connected components, loops and voids. We illustrate how the emerging branch of
statistics devoted to recovering topological structures in the data, Topological Data
Analysis, can be exploited both for exploratory and inferential purposes with a special
emphasis on kernels that preserve the topological information in the data.
Finally, we show with an application how these two approaches can borrow strength
from one another in the identification and description of brain activity through fMRI
data from the ABIDE project.
Uncertainty in phylogenetic tree estimates
Estimating phylogenetic trees is an important problem in evolutionary
biology, environmental policy and medicine. Although trees are estimated, their
uncertainties are discarded by mathematicians working in tree space. Here we
explicitly model the multivariate uncertainty of tree estimates. We consider
both the cases where uncertainty information arises extrinsically (through
covariate information) and intrinsically (through the tree estimates
themselves). The importance of accounting for tree uncertainty in tree space is
demonstrated in two case studies. In the first instance, differences between
gene trees are small relative to their uncertainties, while in the second, the
differences are relatively large. Our main goal is visualization of tree
uncertainty, and we demonstrate advantages of our method with respect to
reproducibility, speed and preservation of topological differences compared to
visualization based on multidimensional scaling. The proposal highlights that
phylogenetic trees are estimated in an extremely high-dimensional space,
resulting in uncertainty information that cannot be discarded. Most
importantly, it is a method that allows biologists to diagnose whether
differences between gene trees are biologically meaningful, or due to
uncertainty in estimation.
Comment: Final version accepted to Journal of Computational and Graphical
Statistics
Picturing classical and quantum Bayesian inference
We introduce a graphical framework for Bayesian inference that is
sufficiently general to accommodate not just the standard case but also recent
proposals for a theory of quantum Bayesian inference wherein one considers
density operators rather than probability distributions as representative of
degrees of belief. The diagrammatic framework is stated in the graphical
language of symmetric monoidal categories and of compact structures and
Frobenius structures therein, in which Bayesian inversion boils down to
transposition with respect to an appropriate compact structure. We characterize
classical Bayesian inference in terms of a graphical property and demonstrate
that our approach eliminates some purely conventional elements that appear in
common representations thereof, such as whether degrees of belief are
represented by probabilities or entropic quantities. We also introduce a
quantum-like calculus wherein the Frobenius structure is noncommutative and
show that it can accommodate Leifer's calculus of 'conditional density
operators'. The notion of conditional independence is also generalized to our
graphical setting and we make some preliminary connections to the theory of
Bayesian networks. Finally, we demonstrate how to construct a graphical
Bayesian calculus within any dagger compact category.
Comment: 38 pages, lots of pictures
Diffusion Variational Autoencoders
A standard Variational Autoencoder, with a Euclidean latent space, is
structurally incapable of capturing topological properties of certain datasets.
To remove topological obstructions, we introduce Diffusion Variational
Autoencoders with arbitrary manifolds as a latent space. A Diffusion
Variational Autoencoder uses transition kernels of Brownian motion on the
manifold. In particular, it uses properties of the Brownian motion to implement
the reparametrization trick and fast approximations to the KL divergence. We
show that the Diffusion Variational Autoencoder is capable of capturing
topological properties of synthetic datasets. Additionally, we train MNIST on
spheres, tori, projective spaces, SO(3), and a torus embedded in R^3. Although a
natural dataset like MNIST does not have latent variables with a clear-cut
topological structure, training it on a manifold can still highlight
topological and geometrical properties.
Comment: 10 pages, 8 figures. Added an appendix with derivation of asymptotic
expansion of KL divergence for heat kernel on arbitrary Riemannian manifolds,
and an appendix with new experiments on binarized MNIST. Added a previously
missing factor in the asymptotic expansion of the heat kernel and corrected a
coefficient in the asymptotic expansion of the KL divergence; further minor edits
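To make the idea of a reparametrized sample from Brownian motion on a manifold concrete, here is a minimal sketch for the unit circle. It assumes that the Brownian motion is approximated by a short Gaussian random walk in the ambient plane, with a projection back onto the circle after every step. The function names, the step count, and the choice of the circle are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def project_to_circle(x, y):
    """Project a nonzero point of the plane onto the unit circle."""
    norm = math.hypot(x, y)
    return x / norm, y / norm


def random_walk_sample(mu, t, noise):
    """Deterministic map (mu, t, noise) -> point on the unit circle.

    mu:    starting point (projected onto the circle first)
    t:     diffusion time, playing the role of a variance parameter
    noise: list of (eps_x, eps_y) standard normal pairs drawn in advance
    """
    x, y = project_to_circle(*mu)
    step = math.sqrt(t / len(noise))  # each step carries variance t / N
    for eps_x, eps_y in noise:
        x, y = project_to_circle(x + step * eps_x, y + step * eps_y)
    return x, y


# All randomness lives in `noise`, so the sample is a deterministic
# function of (mu, t) once the noise is fixed.
rng = random.Random(0)
noise = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(10)]
z = random_walk_sample((1.0, 0.0), 0.1, noise)
```

Because the noise is drawn separately from the parameters, an automatic differentiation framework could in principle differentiate the sample with respect to `mu` and `t`, which is the point of the reparametrization trick.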
Inference In The Space Of Topological Maps: An MCMC-based Approach
©2004 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Presented at the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 28 September-2 October 2004, Sendai, Japan.
DOI: 10.1109/IROS.2004.1389611
While probabilistic techniques have been considered
extensively in the context of metric maps, no general
purpose probabilistic methods exist for topological maps.
We present the concept of Probabilistic Topological Maps
(PTMs), a sample-based representation that approximates
the posterior distribution over topologies given the available
sensor measurements. The PTM is obtained through the
use of MCMC-based Bayesian inference over the space of
all possible topologies. It is shown that the space of all
topologies is equivalent to the space of set partitions of all
available measurements. While the space of possible topologies
is intractably large, our use of Markov chain Monte Carlo
sampling to infer the approximate histograms overcomes the
combinatorial nature of this space and provides a general
solution to the correspondence problem in the context of
topological mapping. We present experimental results that
validate our technique and generate good maps even when
using only odometry as the sensor measurements.
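The idea of approximating a posterior over set partitions by a histogram of MCMC samples can be sketched as follows. This is a toy illustration under stated assumptions, not the sampler from the paper: it uses a Gibbs-style sweep that reassigns one measurement at a time, and `log_score` is a hypothetical unnormalized log-posterior that rewards tight blocks of one-dimensional positions and charges a fixed penalty per block.

```python
import math
import random
from collections import Counter


def to_partition(labels):
    """Convert a label list into a hashable set partition of indices."""
    blocks = {}
    for index, label in enumerate(labels):
        blocks.setdefault(label, []).append(index)
    return frozenset(frozenset(block) for block in blocks.values())


def log_score(partition, points, sigma=0.5, penalty=1.0):
    """Hypothetical unnormalized log-posterior of a set partition."""
    total = 0.0
    for block in partition:
        mean = sum(points[i] for i in block) / len(block)
        total -= sum((points[i] - mean) ** 2 for i in block) / (2 * sigma ** 2)
        total -= penalty  # fixed cost for each block (each distinct place)
    return total


def gibbs_sweep(labels, points, rng):
    """Resample the block of each measurement given all the others."""
    for i in range(len(points)):
        others = sorted({l for j, l in enumerate(labels) if j != i})
        fresh = max(others, default=-1) + 1  # label for a new singleton block
        candidates = others + [fresh]
        log_weights = []
        for candidate in candidates:
            labels[i] = candidate
            log_weights.append(log_score(to_partition(labels), points))
        top = max(log_weights)
        weights = [math.exp(w - top) for w in log_weights]
        labels[i] = rng.choices(candidates, weights=weights)[0]
    return labels


def sample_partitions(points, sweeps=2000, burn_in=200, seed=0):
    """Return a histogram (Counter) of sampled partitions after burn-in."""
    rng = random.Random(seed)
    labels = list(range(len(points)))  # start with every point on its own
    counts = Counter()
    for sweep in range(sweeps):
        gibbs_sweep(labels, points, rng)
        if sweep >= burn_in:
            counts[to_partition(labels)] += 1
    return counts


# Hypothetical odometry-like positions forming two well-separated groups.
points = [0.0, 0.1, 5.0, 5.1]
histogram = sample_partitions(points)
best, _ = histogram.most_common(1)[0]
```

The histogram assigns an estimated posterior mass to each partition that was visited, so partitions that were never sampled implicitly receive zero mass. That is how a sample-based representation sidesteps enumerating the full space of partitions.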