
    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments discussed. Comment: 61 pages
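    As a minimal illustration of the exploratory side of directional statistics discussed above, the sketch below computes the two most basic circular summaries, the mean direction and the mean resultant length, with numpy; the function name and the example data are ours, not the review's.

```python
import numpy as np

def circular_summary(theta):
    """Mean direction and mean resultant length of angles given in radians."""
    # Averaging angles directly fails across the 0/2*pi seam, so embed the
    # angles on the unit circle and average the resulting unit vectors.
    C, S = np.cos(theta).mean(), np.sin(theta).mean()
    mean_direction = np.arctan2(S, C)   # in (-pi, pi]
    resultant_length = np.hypot(C, S)   # in [0, 1]; near 1 = highly concentrated
    return mean_direction, resultant_length

# Wind-direction-like data clustered near 0 radians; 6.2 is ~ -0.08 mod 2*pi.
angles = np.array([0.1, -0.2, 0.05, 6.2, 0.3])
mu, R = circular_summary(angles)
print(mu, R)
```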

    Probabilistic Riemannian submanifold learning with wrapped Gaussian process latent variable models

    Latent variable models (LVMs) learn probabilistic models of data manifolds lying in an \emph{ambient} Euclidean space. In a number of applications, a priori known spatial constraints can shrink the ambient space into a considerably smaller manifold. Additionally, in these applications the Euclidean geometry might induce a suboptimal similarity measure, which could be improved by choosing a different metric. Euclidean models ignore such information and assign probability mass to data points that can never appear as data, and vastly different likelihoods to points that are similar under the desired metric. We propose the wrapped Gaussian process latent variable model (WGPLVM), which extends Gaussian process latent variable models to take values strictly on a given ambient Riemannian manifold, making the model blind to impossible data points. This allows non-linear, probabilistic inference of low-dimensional Riemannian submanifolds from data. Our evaluation on diverse datasets shows improved performance on several tasks, including encoding, visualization and uncertainty quantification.
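    The "wrapping" construction the model builds on is easy to state on the unit circle: draw from a Gaussian in the tangent space at a base point and push the draw onto the manifold with the exponential map. A minimal sketch of that construction (assuming angles in radians; this is the wrapped distribution itself, not the authors' full WGPLVM):

```python
import numpy as np

def sample_wrapped_gaussian(base, sigma, n, seed=0):
    """Sample a wrapped Gaussian on S^1: a Gaussian in the tangent space at
    `base`, pushed onto the circle by the exponential map (wrapping mod 2*pi)."""
    rng = np.random.default_rng(seed)
    v = rng.normal(0.0, sigma, size=n)   # tangent-space draws
    return np.mod(base + v, 2 * np.pi)   # the circle's exp map wraps linearly

samples = sample_wrapped_gaussian(base=np.pi / 4, sigma=0.3, n=1000)
```

    By construction no probability mass falls off the manifold, which is the "blindness to impossible data points" referred to above.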

    Second Order Differences of Cyclic Data and Applications in Variational Denoising

    In many image and signal processing applications, such as interferometric synthetic aperture radar (SAR), electroencephalogram (EEG) data analysis or color image restoration in HSV or LCh spaces, the data has its range on the one-dimensional sphere $\mathbb{S}^1$. Although the minimization of total variation (TV) regularized functionals is among the most popular methods for edge-preserving image restoration, such methods were only very recently applied to cyclic structures. However, as for Euclidean data, TV regularized variational methods suffer from the so-called staircasing effect. This effect can be avoided by involving higher order derivatives in the functional. This is the first paper to use higher order differences of cyclic data in regularization terms of energy functionals for image restoration. We introduce absolute higher order differences for $\mathbb{S}^1$-valued data in a sound way which is independent of the chosen representation system on the circle. Our absolute cyclic first order difference is just the geodesic distance between points. Similar to the geodesic distance, the absolute cyclic second order differences take values only in $[0,\pi]$. We update the cyclic variational TV approach with our new cyclic second order differences. To minimize the corresponding functional we apply a cyclic proximal point method which was recently successfully proposed for Hadamard manifolds. Choosing appropriate cycles, this algorithm can be implemented in an efficient way. The main steps require the evaluation of proximal mappings of our cyclic differences, for which we provide analytical expressions. Under certain conditions we prove the convergence of our algorithm. Various numerical examples with artificial as well as real-world data demonstrate the advantageous performance of our algorithm. Comment: 32 pages, 16 figures, shortened version of submitted manuscript
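    The two differences named in the abstract translate directly into code: wrapping x - y, respectively x - 2y + z, into [-pi, pi) and taking absolute values reproduces the geodesic distance and the stated [0, pi] range. A sketch under the assumption that angles are given in radians:

```python
import numpy as np

def wrap(a):
    """Wrap an angle combination into [-pi, pi)."""
    return np.mod(a + np.pi, 2 * np.pi) - np.pi

def d1(x, y):
    """Absolute cyclic first order difference = geodesic distance on S^1."""
    return np.abs(wrap(x - y))           # values in [0, pi]

def d2(x, y, z):
    """Absolute cyclic second order difference; values in [0, pi]."""
    return np.abs(wrap(x - 2 * y + z))

# Pixels straddling the 0/2*pi seam have a small true difference:
print(d1(0.1, 2 * np.pi - 0.1))          # ~0.2, not ~6.08
```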

    Gaussian Differential Privacy on Riemannian Manifolds

    We develop an advanced approach for extending Gaussian Differential Privacy (GDP) to general Riemannian manifolds. The concept of GDP stands out as a prominent privacy definition that strongly warrants extension to manifold settings, due to its central limit properties. By harnessing the power of the renowned Bishop-Gromov theorem in geometric analysis, we propose a Riemannian Gaussian distribution that integrates the Riemannian distance, allowing us to achieve GDP in Riemannian manifolds with bounded Ricci curvature. To the best of our knowledge, this work marks the first instance of extending the GDP framework to accommodate general Riemannian manifolds, encompassing curved spaces, and circumventing the reliance on tangent space summaries. We provide a simple algorithm to evaluate the privacy budget $\mu$ on any one-dimensional manifold and introduce a versatile Markov Chain Monte Carlo (MCMC)-based algorithm to calculate $\mu$ on any Riemannian manifold with constant curvature. Through simulations on one of the most prevalent manifolds in statistics, the unit sphere $S^d$, we demonstrate the superior utility of our Riemannian Gaussian mechanism in comparison to the previously proposed Riemannian Laplace mechanism for implementing GDP.
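    The central object is the Riemannian Gaussian density p(x) ∝ exp(-d(x, η)² / (2σ²)), with d the geodesic distance. On the circle it can be sampled by plain rejection from a uniform proposal, which is enough to sketch the mechanism's noise distribution (illustrative only; the paper's algorithms for calibrating the privacy budget $\mu$ are separate):

```python
import numpy as np

def riemannian_gaussian_s1(center, sigma, n, seed=0):
    """Rejection-sample p(x) ∝ exp(-d(x, center)^2 / (2 sigma^2)) on S^1,
    where d is the geodesic (arc-length) distance."""
    rng = np.random.default_rng(seed)
    out = []
    while len(out) < n:
        x = rng.uniform(0.0, 2 * np.pi)  # proposal: uniform on the circle
        d = np.abs(np.mod(x - center + np.pi, 2 * np.pi) - np.pi)
        # The unnormalized density is bounded by 1, so this accept step is exact.
        if rng.uniform() < np.exp(-d ** 2 / (2 * sigma ** 2)):
            out.append(x)
    return np.array(out)

noisy_release = riemannian_gaussian_s1(center=1.0, sigma=0.5, n=500)
```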

    A Wasserstein-type Distance for Gaussian Mixtures on Vector Bundles with Applications to Shape Analysis

    This paper uses sample data to study the problem of comparing populations on finite-dimensional parallelizable Riemannian manifolds and more general trivial vector bundles. Utilizing triviality, our framework represents populations as mixtures of Gaussians on vector bundles and estimates the population parameters using a mode-based clustering algorithm. We derive a Wasserstein-type metric between Gaussian mixtures, adapted to the manifold geometry, in order to compare estimated distributions. Our contributions include an identifiability result for Gaussian mixtures on manifold domains and a convenient characterization of optimal couplings of Gaussian mixtures under the derived metric. We demonstrate these tools on some example domains, including the pre-shape space of planar closed curves, with applications to the shape space of triangles and populations of nanoparticles. In the nanoparticle application, we consider a sequence of populations of particle shapes arising from a manufacturing process, and utilize the Wasserstein-type distance to perform change-point detection.
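    In the Euclidean special case the ingredients have standard forms: the squared 2-Wasserstein distance between two Gaussians is a closed-form expression, and the mixture-level distance is a small optimal transport problem over component pairs. A sketch of that Euclidean analogue (the paper's metric additionally adapts these costs to the manifold geometry):

```python
import numpy as np
from scipy.linalg import sqrtm
from scipy.optimize import linprog

def w2_gaussian_sq(m1, S1, m2, S2):
    """Closed-form squared 2-Wasserstein distance between two Gaussians."""
    S2h = sqrtm(S2)
    bures = np.trace(S1 + S2 - 2 * sqrtm(S2h @ S1 @ S2h))
    return float(np.sum((m1 - m2) ** 2) + np.real(bures))

def mixture_w2(wa, comps_a, wb, comps_b):
    """Wasserstein-type distance between Gaussian mixtures: optimal transport
    over components, with pairwise Gaussian W2^2 costs."""
    na, nb = len(wa), len(wb)
    C = np.array([[w2_gaussian_sq(*ca, *cb) for cb in comps_b] for ca in comps_a])
    A_eq, b_eq = [], []
    for i in range(na):                      # row marginals = weights of mixture A
        r = np.zeros(na * nb); r[i * nb:(i + 1) * nb] = 1.0
        A_eq.append(r); b_eq.append(wa[i])
    for j in range(nb):                      # column marginals = weights of mixture B
        c = np.zeros(na * nb); c[j::nb] = 1.0
        A_eq.append(c); b_eq.append(wb[j])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None))
    return float(np.sqrt(res.fun))

# Two one-dimensional two-component mixtures, differing in one component mean.
print(mixture_w2([0.5, 0.5], [(np.zeros(1), np.eye(1)), (3 * np.ones(1), np.eye(1))],
                 [0.5, 0.5], [(np.zeros(1), np.eye(1)), (4 * np.ones(1), np.eye(1))]))
```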

    A Second Order TV-type Approach for Inpainting and Denoising Higher Dimensional Combined Cyclic and Vector Space Data

    In this paper we consider denoising and inpainting problems for higher dimensional combined cyclic and linear space valued data. These kinds of data appear when dealing with nonlinear color spaces such as HSV, and they can be obtained by changing the space domain of, e.g., an optical flow field to polar coordinates. For such nonlinear data spaces, we develop algorithms for the solution of the corresponding second order total variation (TV) type problems for denoising, inpainting, as well as the combination of both. We provide a convergence analysis and we apply the algorithms to concrete problems. Comment: revised submitted version
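    The modelling point is that differences split across the product space: a cyclic difference on the S^1 component (e.g., the hue channel of HSV) and an ordinary difference on the linear components. A sketch of such a combined first order difference (illustration only; the paper's algorithms build second order terms and inpainting on top of this):

```python
import numpy as np

def combined_difference(p, q):
    """First order difference for points in S^1 x R^n, with the cyclic
    coordinate (e.g., hue in radians) stored first."""
    d_cyc = np.mod(p[0] - q[0] + np.pi, 2 * np.pi) - np.pi  # signed geodesic diff
    d_lin = p[1:] - q[1:]                                   # plain linear difference
    return np.concatenate(([d_cyc], d_lin))

# Two HSV-like pixels whose hues sit on opposite sides of the 0/2*pi seam.
a = np.array([0.05, 0.8, 0.6])
b = np.array([6.25, 0.7, 0.5])
print(combined_difference(a, b))   # hue difference ~0.08, not ~ -6.2
```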

    Hyperbolic Deep Neural Networks: A Survey

    Recently, there has been a surge of momentum for deep representation learning in hyperbolic spaces, due to their high capacity for modeling data with hierarchical structure, such as knowledge graphs or synonym hierarchies. We refer to such models as hyperbolic deep neural networks in this paper. Such hyperbolic neural architectures potentially lead to drastically more compact models with much more physical interpretability than their counterparts in Euclidean space. To stimulate future research, this paper presents a coherent and comprehensive review of the literature on the neural components used in the construction of hyperbolic deep neural networks, as well as the generalization of leading deep approaches to hyperbolic space. It also presents current applications across various machine learning tasks on several publicly available datasets, together with insightful observations, open questions, and promising future directions.
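    A concrete anchor for the survey's subject: many of the surveyed architectures work in the Poincare ball model, where the geodesic distance has a simple closed form. A sketch for curvature -1 (the standard formula, not tied to any particular network in the survey):

```python
import numpy as np

def poincare_distance(x, y, eps=1e-9):
    """Geodesic distance between points of the Poincare ball (norm < 1),
    the model of hyperbolic space with curvature -1."""
    sq = np.sum((x - y) ** 2)
    denom = (1.0 - np.sum(x ** 2)) * (1.0 - np.sum(y ** 2))
    return np.arccosh(1.0 + 2.0 * sq / (denom + eps))

# Near the boundary, Euclidean-close points are hyperbolically far apart,
# which is what lets tree-like data embed with low distortion.
u = np.array([0.00, 0.95])
v = np.array([0.10, 0.95])
print(poincare_distance(u, v), np.linalg.norm(u - v))   # ~1.9 vs 0.1
```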

    Isometric Gaussian Process Latent Variable Model for Dissimilarity Data

    We present a probabilistic model where the latent variable respects both the distances and the topology of the modeled data. The model leverages the Riemannian geometry of the generated manifold to endow the latent space with a well-defined stochastic distance measure, modeled locally by Nakagami distributions. These stochastic distances are encouraged to be as similar as possible to distances observed along a neighborhood graph through a censoring process. The model is inferred by variational inference based on observations of pairwise distances. We demonstrate how the new model can encode invariances in the learned manifolds. Comment: ICML 202
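    The Nakagami choice above is natural because the norm of a Gaussian vector is Nakagami-distributed, so distances between uncertain (Gaussian) points fall in this family. A sketch of that correspondence using scipy (illustrative only; not the authors' variational inference):

```python
import numpy as np
from scipy.stats import nakagami

# Norms of isotropic d-dimensional Gaussian vectors are Nakagami-distributed.
rng = np.random.default_rng(0)
d, sigma = 3, 0.5
norms = np.linalg.norm(rng.normal(0.0, sigma, size=(10_000, d)), axis=1)

nu, loc, scale = nakagami.fit(norms, floc=0)  # fit shape nu with location fixed at 0
print(nu, scale)   # expect nu ~ d/2 = 1.5 and scale ~ sigma * sqrt(d) ~ 0.87
```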