Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.
Comment: 61 pages
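To make the review's starting point concrete: the most basic exploratory summaries for circular data are the mean direction and the mean resultant length, which replace the Euclidean sample mean and variance. A minimal sketch of these standard quantities (illustrative, not code from the paper):

```python
import numpy as np

def circular_summary(theta):
    """Exploratory summaries for a sample of angles theta (radians):
    mean direction, mean resultant length R, and circular variance,
    as treated in texts such as Mardia and Jupp (1999)."""
    C, S = np.mean(np.cos(theta)), np.mean(np.sin(theta))
    mean_direction = np.arctan2(S, C)   # in (-pi, pi]
    R = np.hypot(C, S)                  # in [0, 1]; 1 = fully concentrated
    circular_variance = 1.0 - R         # circular analogue of the variance
    return mean_direction, R, circular_variance

# Example: directions clustered around pi/4.
rng = np.random.default_rng(0)
theta = np.pi / 4 + 0.3 * rng.standard_normal(200)
print(circular_summary(theta))
```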
Probabilistic Riemannian submanifold learning with wrapped Gaussian process latent variable models
Latent variable models (LVMs) learn probabilistic models of data manifolds
lying in an \emph{ambient} Euclidean space. In a number of applications, a
priori known spatial constraints can shrink the ambient space into a
considerably smaller manifold. Additionally, in these applications the
Euclidean geometry might induce a suboptimal similarity measure, which could be
improved by choosing a different metric. Euclidean models ignore such
information and assign probability mass to data points that can never appear as
data, and vastly different likelihoods to points that are similar under the
desired metric. We propose the wrapped Gaussian process latent variable model (WGPLVM), which extends Gaussian process latent variable models to take values strictly on a given ambient Riemannian manifold, making the model blind to impossible data points. This allows non-linear, probabilistic inference of low-dimensional Riemannian submanifolds from data. Our evaluation on diverse datasets shows that we improve performance on several tasks, including encoding, visualization and uncertainty quantification.
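The "wrapping" at the heart of the WGPLVM places a Gaussian in the tangent space of the manifold and pushes it onto the manifold through the exponential map, so no probability mass can land off the manifold. A minimal sketch of that construction on the unit sphere $S^2$ (the manifold choice and function names are illustrative assumptions, not the paper's code):

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere S^2: follow the geodesic
    from base point p along tangent vector v (v orthogonal to p)."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * (v / norm_v)

def sample_wrapped_gaussian(p, cov, n, rng=None):
    """Sample a wrapped Gaussian on S^2: draw tangent vectors at p from
    a 2D Gaussian (in an orthonormal tangent basis) and push them onto
    the sphere through the exponential map, so every sample lies on S^2."""
    rng = np.random.default_rng(rng)
    # Build an orthonormal basis (e1, e2) of the tangent plane at p.
    a = np.array([1.0, 0.0, 0.0])
    if abs(p @ a) > 0.9:
        a = np.array([0.0, 1.0, 0.0])
    e1 = a - (a @ p) * p
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(p, e1)
    coeffs = rng.multivariate_normal(np.zeros(2), cov, size=n)
    return np.array([sphere_exp(p, c[0] * e1 + c[1] * e2) for c in coeffs])

p = np.array([0.0, 0.0, 1.0])  # base point at the north pole
X = sample_wrapped_gaussian(p, 0.05 * np.eye(2), n=500, rng=0)
assert np.allclose(np.linalg.norm(X, axis=1), 1.0)  # all samples on the sphere
```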
Second Order Differences of Cyclic Data and Applications in Variational Denoising
In many image and signal processing applications, such as interferometric synthetic aperture radar (SAR), electroencephalogram (EEG) data analysis, or color image restoration in HSV or LCh spaces, the data take values on the one-dimensional sphere $\mathbb{S}^1$. Although the minimization of total variation (TV) regularized functionals is among the most popular methods for edge-preserving image restoration, such methods were only very recently applied to cyclic structures. However, as for Euclidean data, TV regularized variational methods suffer from the so-called staircasing effect. This effect can be avoided by incorporating higher order derivatives into the functional. This is the first paper to use higher order differences of cyclic data in regularization terms of energy functionals for image restoration. We introduce absolute higher order differences for $\mathbb{S}^1$-valued data in a sound way that is independent of the chosen representation system on the circle. Our absolute cyclic first order difference is just the geodesic distance between points. Like the geodesic distance, the absolute cyclic second order differences take values only in $[0,\pi]$. We update the cyclic variational TV
approach by our new cyclic second order differences. To minimize the
corresponding functional we apply a cyclic proximal point method which was
recently successfully proposed for Hadamard manifolds. Choosing appropriate
cycles this algorithm can be implemented in an efficient way. The main steps
require the evaluation of proximal mappings of our cyclic differences for which
we provide analytical expressions. Under certain conditions we prove the
convergence of our algorithm. Various numerical examples with artificial as
well as real-world data demonstrate the advantageous performance of our
algorithm.
Comment: 32 pages, 16 figures, shortened version of submitted manuscript
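The differences themselves reduce to angle wrapping: the absolute first order difference is the geodesic distance on the circle, and the second order difference wraps $x - 2y + z$ back into $(-\pi, \pi]$, making it independent of how angles are represented. A short illustrative sketch consistent with these definitions:

```python
import numpy as np

def wrap(a):
    """Wrap angles to (-pi, pi]."""
    return np.mod(a + np.pi, 2 * np.pi) - np.pi

def cyclic_d1(x, y):
    """Absolute cyclic first order difference: the geodesic
    distance between two points on the circle, in [0, pi]."""
    return np.abs(wrap(x - y))

def cyclic_d2(x, y, z):
    """Absolute cyclic second order difference of three neighboring
    values, wrapped so it is independent of the representation
    system and takes values in [0, pi]."""
    return np.abs(wrap(x - 2.0 * y + z))

# A geodesic ramp of angles has vanishing second difference even
# where the raw values jump across the 2*pi boundary:
theta = wrap(np.linspace(0.0, 4.0 * np.pi, 9))
print(cyclic_d2(theta[:-2], theta[1:-1], theta[2:]))  # ~0 everywhere
```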
Gaussian Differential Privacy on Riemannian Manifolds
We develop an advanced approach for extending Gaussian Differential Privacy
(GDP) to general Riemannian manifolds. The concept of GDP stands out as a
prominent privacy definition that strongly warrants extension to manifold
settings, due to its central limit properties. By harnessing the power of the
renowned Bishop-Gromov theorem in geometric analysis, we propose a Riemannian
Gaussian distribution that integrates the Riemannian distance, allowing us to
achieve GDP in Riemannian manifolds with bounded Ricci curvature. To the best
of our knowledge, this work marks the first instance of extending the GDP
framework to accommodate general Riemannian manifolds, encompassing curved
spaces, and circumventing the reliance on tangent space summaries. We provide a
simple algorithm to evaluate the privacy budget $\mu$ on any one-dimensional manifold and introduce a versatile Markov Chain Monte Carlo (MCMC)-based algorithm to calculate $\mu$ on any Riemannian manifold with constant curvature. Through simulations on one of the most prevalent manifolds in statistics, the unit sphere $S^d$, we demonstrate the superior utility of our Riemannian Gaussian mechanism in comparison to the previously proposed Riemannian Laplace mechanism for implementing GDP.
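As a toy illustration of the central object, a Gaussian-type density built from the geodesic distance, here is a hedged sketch of sampling such a distribution on the circle $S^1$ by rejection; the sampler and parameters are illustrative and are not the paper's algorithm:

```python
import numpy as np

def riemannian_gaussian_s1(mu, sigma, n, rng=None):
    """Rejection sampler for the density p(theta) proportional to
    exp(-d(theta, mu)^2 / (2 sigma^2)) on the circle, where d is the
    geodesic (arc-length) distance. Proposals are uniform on the
    circle; the unnormalized density is bounded by 1, so the
    acceptance step is exact."""
    rng = np.random.default_rng(rng)
    samples = []
    while len(samples) < n:
        theta = rng.uniform(-np.pi, np.pi, size=n)
        d = np.abs(np.mod(theta - mu + np.pi, 2 * np.pi) - np.pi)
        accept = rng.uniform(size=n) < np.exp(-d**2 / (2 * sigma**2))
        samples.extend(theta[accept].tolist())
    return np.array(samples[:n])

# Privatize a circular summary (e.g., a mean direction) by releasing
# a noisy point centered at it; larger sigma means stronger privacy.
noisy = riemannian_gaussian_s1(mu=0.7, sigma=0.2, n=1, rng=1)
```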
A Wasserstein-type Distance for Gaussian Mixtures on Vector Bundles with Applications to Shape Analysis
This paper uses sample data to study the problem of comparing populations on
finite-dimensional parallelizable Riemannian manifolds and more general trivial
vector bundles. Utilizing triviality, our framework represents populations as
mixtures of Gaussians on vector bundles and estimates the population parameters
using a mode-based clustering algorithm. We derive a Wasserstein-type metric
between Gaussian mixtures, adapted to the manifold geometry, in order to
compare estimated distributions. Our contributions include an identifiability
result for Gaussian mixtures on manifold domains and a convenient
characterization of optimal couplings of Gaussian mixtures under the derived
metric. We demonstrate these tools on some example domains, including the
pre-shape space of planar closed curves, with applications to the shape space
of triangles and populations of nanoparticles. In the nanoparticle application,
we consider a sequence of populations of particle shapes arising from a
manufacturing process, and utilize the Wasserstein-type distance to perform
change-point detection.
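For orientation, the Euclidean building block is classical: between two Gaussians, the 2-Wasserstein distance has a closed form, and mixture-level Wasserstein-type metrics are built by optimally coupling mixture components under such pairwise distances. A sketch of the Gaussian-to-Gaussian formula (the paper adapts this idea to manifold geometry; this is only the flat-space version):

```python
import numpy as np
from scipy.linalg import sqrtm

def w2_gaussian(m1, S1, m2, S2):
    """Closed-form 2-Wasserstein distance between N(m1, S1) and
    N(m2, S2) in Euclidean space: squared mean difference plus the
    Bures term between the covariances."""
    rS1 = sqrtm(S1)
    cross = np.real(sqrtm(rS1 @ S2 @ rS1))
    bures = np.trace(S1 + S2 - 2.0 * cross)
    return np.sqrt(np.sum((m1 - m2) ** 2) + max(bures, 0.0))

m1, S1 = np.zeros(2), np.eye(2)
m2, S2 = np.array([1.0, 0.0]), 2.0 * np.eye(2)
print(w2_gaussian(m1, S1, m2, S2))  # ~1.159
```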
A Second Order TV-type Approach for Inpainting and Denoising Higher Dimensional Combined Cyclic and Vector Space Data
In this paper we consider denoising and inpainting problems for higher
dimensional combined cyclic and linear space valued data. These kinds of data
appear when dealing with nonlinear color spaces such as HSV, and they can be
obtained by changing the space domain of, e.g., an optical flow field to polar
coordinates. For such nonlinear data spaces, we develop algorithms for the
solution of the corresponding second order total variation (TV) type problems
for denoising, inpainting as well as the combination of both. We provide a
convergence analysis and we apply the algorithms to concrete problems.Comment: revised submitted versio
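Data of this combined type live on a product space such as $\mathbb{S}^1 \times \mathbb{R}^n$, where distances decouple across the cyclic and linear channels. A minimal sketch under that assumed setup (not the paper's code):

```python
import numpy as np

def product_distance(theta1, v1, theta2, v2):
    """Distance on the product space S^1 x R^n: the geodesic (wrapped)
    distance on the cyclic channel combined with the Euclidean
    distance on the linear channels."""
    d_cyc = np.abs(np.mod(theta1 - theta2 + np.pi, 2 * np.pi) - np.pi)
    d_lin = np.linalg.norm(np.asarray(v1) - np.asarray(v2))
    return np.sqrt(d_cyc**2 + d_lin**2)

# HSV-like example: hue is cyclic, saturation and value are linear.
# The two hues below are close across the 2*pi boundary.
print(product_distance(0.1, [0.5, 0.8], 2 * np.pi - 0.1, [0.5, 0.7]))
```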
Hyperbolic Deep Neural Networks: A Survey
Recently, there has been a rising surge of momentum for deep representation learning in hyperbolic spaces, owing to their high capacity for modeling data with hierarchical structure, such as knowledge graphs or synonym hierarchies. We refer to such models as hyperbolic deep neural networks in this paper. Such a hyperbolic neural architecture can potentially lead to drastically more compact models, with much more physical interpretability, than its counterpart in Euclidean space. To stimulate future research, this paper presents a coherent and comprehensive review of the literature around the neural components used in the construction of hyperbolic deep neural networks, as well as the generalization of the leading deep approaches to hyperbolic space. It also surveys current applications across various machine learning tasks on several publicly available datasets, together with insightful observations, open questions, and promising future directions.
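Most of the surveyed architectures work in the Poincaré ball model, whose geodesic distance is cheap to evaluate and grows rapidly toward the boundary, mirroring the exponential growth of trees. A small sketch of that standard distance (illustrative, not tied to any single surveyed network):

```python
import numpy as np

def poincare_distance(x, y, eps=1e-9):
    """Geodesic distance in the Poincare ball model of hyperbolic
    space. Points must satisfy ||x|| < 1; distances blow up near the
    boundary, which gives the space its capacity for tree-like
    hierarchies."""
    sq = np.sum((x - y) ** 2)
    denom = (1.0 - np.sum(x**2)) * (1.0 - np.sum(y**2))
    return np.arccosh(1.0 + 2.0 * sq / max(denom, eps))

root = np.array([0.0, 0.0])           # near the origin: a hierarchy "root"
leaf = np.array([0.95, 0.0])          # near the boundary: a "leaf"
print(poincare_distance(root, leaf))  # large, despite Euclidean distance < 1
```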
Isometric Gaussian Process Latent Variable Model for Dissimilarity Data
We present a probabilistic model where the latent variable respects both the
distances and the topology of the modeled data. The model leverages the
Riemannian geometry of the generated manifold to endow the latent space with a
well-defined stochastic distance measure, which is modeled locally by Nakagami distributions. Through a censoring process, these stochastic distances are encouraged to match the observed distances along a neighborhood graph. The model is inferred by variational inference based on observations
of pairwise distances. We demonstrate how the new model can encode invariances in the learned manifolds.
Comment: ICML 2021
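The Nakagami choice is natural because the norm of a Gaussian vector is Nakagami-distributed, so distances between points with Gaussian uncertainty have exactly this form. A quick numerical check of that fact (independent of the paper's inference code):

```python
import numpy as np
from scipy import stats

# For z ~ N(0, sigma^2 I_d), the norm ||z|| is Nakagami-distributed
# with shape m = d/2 and spread Omega = d * sigma^2; in SciPy's
# parameterization this is nakagami(nu=m, scale=sqrt(Omega)).
d, sigma = 3, 0.5
rng = np.random.default_rng(0)
norms = np.linalg.norm(sigma * rng.standard_normal((100_000, d)), axis=1)

model = stats.nakagami(d / 2, scale=np.sqrt(d) * sigma)
ks = stats.kstest(norms, model.cdf)
print(ks.statistic)  # small: the empirical norms match the Nakagami law
```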