
    Bayesian perspectives on statistical modelling

    This thesis explores the representation of probability measures in a coherent Bayesian modelling framework, together with the ensuing characterisation properties of posterior functionals. First, a decision-theoretic approach is adopted to provide a unified modelling criterion applicable to assessing prior-likelihood combinations, design matrices, model dimensionality and choice of sample size. The utility structure and associated Bayes risk induce a distance measure, introducing concepts from differential geometry to aid the interpretation of modelling characteristics. Secondly, analytical and approximate computations for the implementation of the Bayesian paradigm, based on the properties of the class of transformation models, are discussed. Finally, relationships between distance measures (in the form of either a derivative of a Bayes mapping or an induced distance) are explored, with particular reference to the construction of sensitivity measures.
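    As a purely illustrative aside (this example is not taken from the thesis): under the logarithmic utility $U(q, x) = \log q(x)$, the expected drop in utility incurred by reporting a distribution $q$ when the data actually follow $p$ is the Kullback-Leibler divergence, which is one standard way a utility structure induces a distance between candidate modelling choices:

\[
d(p, q) \;=\; \mathbb{E}_{p}\!\left[\log p(X) - \log q(X)\right] \;=\; \int p(x)\,\log\frac{p(x)}{q(x)}\,\mathrm{d}x \;=\; \mathrm{KL}(p \,\|\, q).
\]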

    Statistical inference for generative models with maximum mean discrepancy

    While likelihood-based inference and its variants provide a statistically efficient and widely applicable approach to parametric inference, their application to models involving intractable likelihoods poses challenges. In this work, we study a class of minimum distance estimators for intractable generative models, that is, statistical models for which the likelihood is intractable but simulation is cheap. The distance considered, the maximum mean discrepancy (MMD), is defined through the embedding of probability measures into a reproducing kernel Hilbert space. We study the theoretical properties of these estimators, showing that they are consistent, asymptotically normal and robust to model misspecification. A main advantage of these estimators is the flexibility offered by the choice of kernel, which can be used to trade off statistical efficiency and robustness. On the algorithmic side, we study the geometry induced by MMD on the parameter space and use this to introduce a novel natural-gradient-descent-like algorithm for efficient implementation of these estimators. We illustrate the relevance of our theoretical results on several classes of models, including a discrete-time latent Markov process and two multivariate stochastic differential equation models.
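    The basic estimator can be sketched in a few lines. The snippet below is a hedged illustration only: it uses a toy Gaussian location simulator and a grid search in place of the kernel choices, natural-gradient scheme and model classes actually studied in the paper.

```python
# Minimal minimum-MMD estimation sketch (illustrative; not the paper's code).
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between two one-dimensional samples."""
    diff = x[:, None] - y[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def mmd2(x, y, bandwidth=1.0):
    """V-statistic estimate of the squared MMD between samples x and y."""
    return (gaussian_kernel(x, x, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean()
            - 2.0 * gaussian_kernel(x, y, bandwidth).mean())

def simulate(theta, n=200):
    """Cheap simulator standing in for an intractable-likelihood generative model."""
    return theta + rng.standard_normal(n)

y_obs = simulate(2.0, n=200)          # observed data from the "true" theta = 2.0

# Crude grid search over the MMD objective; the paper instead studies
# gradient-based (natural-gradient-like) minimisation of this loss.
grid = np.linspace(-1.0, 5.0, 121)
theta_hat = grid[int(np.argmin([mmd2(simulate(t), y_obs) for t in grid]))]
print(f"minimum-MMD estimate: {theta_hat:.2f}")
```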

    The geometry of off-the-grid compressed sensing

    This paper presents a sharp geometric analysis of the recovery performance of sparse regularization. More specifically, we analyze the BLASSO method, which estimates a sparse measure (a sum of Dirac masses) from randomized sub-sampled measurements. This is a "continuous", often called off-the-grid, extension of the compressed sensing problem, where the $\ell^1$ norm is replaced by the total variation of measures. This extension is appealing from a numerical perspective because it avoids discretizing the space on a grid. But more importantly, it makes explicit the geometry of the problem, since the positions of the Diracs can now move freely over the parameter space. On a methodological level, our contribution is to propose the Fisher geodesic distance on this parameter space as the canonical metric to analyze super-resolution in a way which is invariant to reparameterization of this space. Switching to the Fisher metric allows us to take into account measurement operators which are not translation invariant, which is crucial for applications such as Laplace inversion in imaging, Gaussian mixture estimation and training of multilayer perceptrons with one hidden layer. On a theoretical level, our main contribution shows that if the Fisher distance between spikes is larger than a Rayleigh separation constant, then the BLASSO recovers in a stable way a stream of Diracs, provided that the number of measurements is proportional (up to log factors) to the number of Diracs. We measure the stability using an optimal transport distance constructed on top of the Fisher geodesic distance. Our result is sharp (up to log factors) and does not require any randomness assumption on the amplitudes of the underlying measure. Our proof technique relies on an infinite-dimensional extension of the so-called "golfing scheme", which operates over the space of measures and is of general interest.
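    For reference, the BLASSO program analysed here can be written (up to notational choices) as total-variation-regularised least squares over the space of measures on the parameter domain $\mathcal{X}$:

\[
\min_{\mu \in \mathcal{M}(\mathcal{X})} \; \frac{1}{2}\,\bigl\| y - \Phi\mu \bigr\|^{2} + \lambda\,|\mu|(\mathcal{X}),
\qquad
\Phi\mu = \int_{\mathcal{X}} \varphi(x)\,\mathrm{d}\mu(x),
\]

    where the total-variation norm $|\mu|(\mathcal{X})$ plays the role of the $\ell^1$ norm of discrete compressed sensing and the sought measure is a sum of Dirac masses $\mu_0 = \sum_i a_i\,\delta_{x_i}$; the separation condition of the paper is then stated in terms of the Fisher geodesic distance between the positions $x_i$.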

    Causal Models over Infinite Graphs and their Application to the Sensorimotor Loop: General Stochastic Aspects and Gradient Methods for Optimal Control

    Motivation and background: The enormous range of capabilities that every human learns throughout life is probably among the most remarkable and fascinating aspects of life. Learning has therefore drawn much interest from scientists working in very different fields such as philosophy, biology, sociology, educational science, computer science and mathematics. This thesis focuses on the information-theoretical and mathematical aspects of learning. We are interested in the learning process of an agent (which may be, for example, a human, an animal, a robot, an economic institution or a state) that interacts with its environment. Common models for this interaction are Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). Learning is then considered to be the maximization of the expectation of a predefined reward function. In order to formulate general principles (such as a formal definition of curiosity-driven learning or avoidance of unpleasant situations) in a rigorous way, it is desirable to have a theoretical framework for the optimization of more complex functionals of the underlying process law. This might include the entropy of certain sensor values or their mutual information. Optimization of the latter quantity (also known as predictive information) has been investigated intensively, both theoretically and experimentally using computer simulations, by N. Ay, R. Der, K. Zahedi and G. Martius. In this thesis, we develop a mathematical theory for learning in the sensorimotor loop beyond expected reward maximization.

    Approaches and results: This thesis covers four topics related to the theory of learning in the sensorimotor loop. First of all, we need to specify the model of an agent interacting with its environment, with or without learning. This interaction naturally results in complex causal dependencies. Since we are interested in asymptotic properties of learning algorithms, it is necessary to consider infinite time horizons. It turns out that the well-understood theory of causal networks known from the machine learning literature is not powerful enough for our purpose. We therefore extend important theorems on causal networks to infinite graphs and general state spaces using analytical methods from measure-theoretic probability theory and the theory of discrete-time stochastic processes. Furthermore, we prove a generalization of the strong Markov property from Markov processes to infinite causal networks.

    Secondly, we develop a new idea for a projected stochastic constraint optimization algorithm. In general, a discrete gradient ascent algorithm can be used to generate an iterative sequence that converges to the stationary points of a given optimization problem. Whenever the optimization takes place over a compact subset of a vector space, it is possible that the iterative sequence leaves the constraint set. One way to cope with this problem is to project all points back to the constraint set using Euclidean best approximation, which is sometimes difficult to calculate. A concrete example is optimization over the unit ball in a matrix space equipped with the operator norm. Our idea consists of a back-projection using quasi-projectors different from the Euclidean best approximation. In the matrix example, there is another canonical way to force the iterative sequence to stay in the constraint set: whenever a point leaves the unit ball, it is divided by its norm. For a given target function, this procedure might introduce spurious stationary points on the boundary. We show that this problem can be circumvented by using a gradient that is tailored to the quasi-projector used for back-projection. We state a general technical compatibility condition between a quasi-projector and a metric used for gradient ascent, prove convergence of stochastic iterative sequences and provide an appropriate metric for the unit-ball example.

    Thirdly, a class of learning problems in the sensorimotor loop is defined and motivated. This class of problems is more general than the usual expected reward maximization and is illustrated by numerous examples (such as expected reward maximization, maximization of the predictive information, maximization of the entropy and minimization of the variance of a given reward function). We also provide stationarity conditions together with appropriate gradient formulas. Last but not least, we prove convergence of a stochastic optimization algorithm (as considered in the second topic) applied to a general learning problem (as considered in the third topic). The learning algorithm is shown to converge to the set of stationary points. Among other things, the proof covers the convergence of an improved version of an algorithm for the maximization of the predictive information proposed by N. Ay, R. Der and K. Zahedi. We also investigate an application to a linear Gaussian dynamic, where the policies are encoded by the unit ball in a space of matrices equipped with the operator norm.
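    The back-projection idea on the matrix unit ball can be sketched as follows. This is a hedged illustration with a placeholder objective and a plain Euclidean stochastic gradient, not the compatible metric constructed in the thesis.

```python
# Projected stochastic gradient ascent on the operator-norm unit ball (sketch).
# The objective and its noisy gradient are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(1)

def quasi_project(w):
    """Back-projection: if a point leaves the unit ball, divide it by its operator norm."""
    s = np.linalg.norm(w, ord=2)          # largest singular value
    return w / s if s > 1.0 else w

def noisy_grad(w):
    """Placeholder stochastic gradient of a toy concave objective with optimum inside the ball."""
    target = np.array([[0.5, 0.0], [0.0, -0.3]])
    return (target - w) + 0.1 * rng.standard_normal(w.shape)

w = np.zeros((2, 2))
for t in range(1, 2001):
    w = quasi_project(w + (1.0 / t) * noisy_grad(w))   # diminishing step sizes

print(np.round(w, 2))                      # iterates remain inside the unit ball throughout
```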

    The 40th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering

    These proceedings aim to collect the ideas presented, discussed, and disputed at the 40th Workshop on Bayesian Inference and Maximum Entropy, MaxEnt 2021. Skilling and Knuth seek to rebuild the foundations of quantum mechanics from probability theory, and Caticha competes in that endeavour with a very different entropy-based approach. Costa connects entropy with general relativity, Pessoa reports new insights on ecology and Yousefi derives classical density functional theory, both through the maximum entropy principle. Von Toussaint, Preuss, Albert, Rath, Ranftl and Kvas report the latest developments in regression and surrogate-based inference with applications to optimization and inverse problems in plasma physics, biomechanics and geodesy. Van Soom presents new priors for phonetics, Stern et al. propose a new haphazard sampling method, and Kelter uncovers two measure-theoretic issues with hypothesis testing.

    Differential geometric MCMC methods and applications

    This thesis presents novel Markov chain Monte Carlo methodology that exploits the natural representation of a statistical model as a Riemannian manifold. The methods developed provide generalisations of the Metropolis-adjusted Langevin algorithm and the Hybrid Monte Carlo algorithm for Bayesian statistical inference, and resolve many shortcomings of existing Monte Carlo algorithms when sampling from target densities that may be high dimensional and exhibit strong correlation structure. The performance of these Riemannian manifold Markov chain Monte Carlo algorithms is rigorously assessed by performing Bayesian inference on logistic regression models, log-Gaussian Cox point process models, stochastic volatility models, and both parameter and model level inference of dynamical systems described by nonlinear differential equations.
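    For orientation, a Metropolis-adjusted Langevin sampler with a constant preconditioning matrix is sketched below on a correlated Gaussian target; the Riemannian-manifold algorithms developed in the thesis instead let this metric vary with position, which is not reproduced here.

```python
# Preconditioned MALA sketch (constant metric; illustrative only).
import numpy as np

rng = np.random.default_rng(2)
PREC = np.array([[2.0, -1.5], [-1.5, 2.0]])      # precision matrix of the Gaussian target

def log_target(x):
    return -0.5 * x @ PREC @ x                   # log-density up to an additive constant

def grad_log_target(x):
    return -PREC @ x

def mala_step(x, eps, A):
    """One MALA step with proposal N(x + 0.5*eps^2*A*grad, eps^2*A)."""
    L = np.linalg.cholesky(A)
    A_inv = np.linalg.inv(A)
    def log_q(z, mean):                           # log proposal density, up to a constant
        d = z - mean
        return -0.5 * (d @ A_inv @ d) / eps**2
    mean_x = x + 0.5 * eps**2 * A @ grad_log_target(x)
    prop = mean_x + eps * L @ rng.standard_normal(x.size)
    mean_p = prop + 0.5 * eps**2 * A @ grad_log_target(prop)
    log_alpha = (log_target(prop) + log_q(x, mean_p)) - (log_target(x) + log_q(prop, mean_x))
    return prop if np.log(rng.uniform()) < log_alpha else x

x, eps, A = np.zeros(2), 0.6, np.linalg.inv(PREC)   # precondition with the target covariance
samples = []
for _ in range(5000):
    x = mala_step(x, eps, A)
    samples.append(x)
print(np.cov(np.array(samples).T).round(2))          # should approach inv(PREC)
```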