510 research outputs found

    Invertible Particle Flow-based Sequential MCMC with extension to Gaussian Mixture noise models

    Sequential state estimation in non-linear and non-Gaussian state spaces has a wide range of applications in statistics and signal processing. One of the most effective non-linear filtering approaches, particle filtering, suffers from weight degeneracy in high-dimensional filtering scenarios. Several avenues have been pursued to address high dimensionality. Among these, particle flow particle filters construct effective proposal distributions by using invertible flow to migrate particles continuously from the prior distribution to the posterior, and sequential Markov chain Monte Carlo (SMCMC) methods use a Metropolis-Hastings (MH) accept-reject approach to improve filtering performance. In this paper, we propose to combine the strengths of invertible particle flow and SMCMC by constructing a composite MH kernel within the SMCMC framework using invertible particle flow. In addition, we propose a Gaussian mixture model (GMM)-based particle flow algorithm to construct effective MH kernels for multi-modal distributions. Simulation results show that for high-dimensional state estimation example problems the proposed kernels significantly increase the acceptance rate with minimal additional computational overhead and improve estimation accuracy compared with state-of-the-art filtering algorithms.
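
The weight degeneracy mentioned in this abstract can be illustrated with a small sketch. It is not the paper's method: it assumes a hypothetical toy model (standard Gaussian prior, identity observation with Gaussian noise), uses the prior as proposal, and reports the effective sample size (ESS) of the normalized importance weights as the state dimension grows. The function names and parameter values are illustrative assumptions.

```python
import numpy as np

def effective_sample_size(log_w):
    """ESS = 1 / sum(w_i^2) for normalized weights, computed from log-weights."""
    w = np.exp(log_w - np.max(log_w))  # subtract the max for numerical stability
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def prior_proposal_ess(dim, n_particles=1000, obs_std=0.5, seed=0):
    """ESS when particles are drawn from the prior N(0, I) in `dim` dimensions.

    Toy model (an assumption for illustration): y = x + obs_std * noise.
    """
    rng = np.random.default_rng(seed)
    x_true = rng.standard_normal(dim)
    y = x_true + obs_std * rng.standard_normal(dim)
    particles = rng.standard_normal((n_particles, dim))
    # Gaussian log-likelihood of y for each particle, up to an additive constant
    log_w = -0.5 * np.sum((y - particles) ** 2, axis=1) / obs_std ** 2
    return effective_sample_size(log_w)

if __name__ == "__main__":
    for d in (1, 5, 20, 100):
        print(d, prior_proposal_ess(d))
```

In this toy setting the ESS falls towards 1 as the dimension increases, which is the collapse that better proposals (such as flow-based ones) are meant to counter.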

    Inverse Problems and Data Assimilation

    These notes are designed with the aim of providing a clear and concise introduction to the subjects of Inverse Problems and Data Assimilation, and their inter-relations, together with citations to some relevant literature in this area. The first half of the notes is dedicated to studying the Bayesian framework for inverse problems. Techniques such as importance sampling and Markov Chain Monte Carlo (MCMC) methods are introduced; these methods have the desirable property that in the limit of an infinite number of samples they reproduce the full posterior distribution. Since it is often computationally intensive to implement these methods, especially in high dimensional problems, approximate techniques such as approximating the posterior by a Dirac or a Gaussian distribution are discussed. The second half of the notes covers data assimilation. This refers to a particular class of inverse problems in which the unknown parameter is the initial condition of a dynamical system, and in the stochastic dynamics case the subsequent states of the system, and the data comprises partial and noisy observations of that (possibly stochastic) dynamical system. We will also demonstrate that methods developed in data assimilation may be employed to study generic inverse problems, by introducing an artificial time to generate a sequence of probability measures interpolating from the prior to the posterior.
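
As a rough illustration of the importance sampling idea mentioned in this abstract, the sketch below estimates a posterior mean by self-normalized importance sampling with the prior as proposal. The scalar conjugate Gaussian model (prior N(0, 1), observation y = u + noise) is an assumption chosen because the exact posterior mean, y / (1 + obs_std^2), is available for comparison; it is not taken from the notes.

```python
import numpy as np

def snis_posterior_mean(y, obs_std, n_samples=200_000, seed=0):
    """Self-normalized importance sampling estimate of E[u | y].

    Assumed toy model: u ~ N(0, 1), y = u + obs_std * noise.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n_samples)           # proposal = prior
    log_w = -0.5 * ((y - u) / obs_std) ** 2      # log-likelihood up to a constant
    w = np.exp(log_w - log_w.max())              # stabilize before normalizing
    w /= w.sum()
    return float(np.sum(w * u))

def exact_posterior_mean(y, obs_std):
    """Closed-form posterior mean for the conjugate Gaussian toy model."""
    return y / (1.0 + obs_std ** 2)

if __name__ == "__main__":
    print(snis_posterior_mean(1.0, 1.0), exact_posterior_mean(1.0, 1.0))
```

The estimate approaches the exact value as the number of samples grows, matching the large-sample property the abstract describes.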

    Variational Bayesian inference with complex geostatistical priors using inverse autoregressive flows

    We combine inverse autoregressive flows (IAF) and variational Bayesian inference (variational Bayes) in the context of geophysical inversion parameterized with deep generative models encoding complex priors. Variational Bayes approximates the unnormalized posterior distribution parametrically within a given family of distributions by solving an optimization problem. Although prone to bias if the chosen family of distributions is too limited, it provides a computationally efficient approach that scales well to high-dimensional inverse problems. To enhance the expressiveness of the variational distribution, we explore its combination with IAFs that allow samples from a simple base distribution to be pushed forward through a series of invertible transformations onto an approximate posterior. The IAF is learned by maximizing the lower bound of the evidence (marginal likelihood), which is equivalent to minimizing the Kullback–Leibler divergence between the approximation and the target posterior distribution. In our examples, we use either a deep generative adversarial network (GAN) or a variational autoencoder (VAE) to parameterize complex geostatistical priors. Although previous attempts to perform Gauss–Newton inversion in combination with GANs of the same architecture proved unsuccessful, the trained IAF provides a good reconstruction of channelized subsurface models for both GAN- and VAE-based inversions using synthetic crosshole ground-penetrating-radar data. For the considered examples, the computational cost of our approach is seven times lower than for Markov chain Monte Carlo (MCMC) inversion. Furthermore, the VAE-based approximations in the latent space are in good agreement. The VAE-based inversion requires only one sample to estimate gradients with respect to the IAF parameters at each iteration, while the GAN-based inversions need more samples and the corresponding posterior approximation is less accurate.
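
The equivalence stated in this abstract, between maximizing the evidence lower bound and minimizing the Kullback–Leibler divergence, follows from a standard identity. The notation below is an assumption for illustration (latent variables $z$, data $y$, variational parameters $\phi$, flow layers $f_k$) and is not taken from the paper.

```latex
% Evidence decomposition: log p(y) does not depend on \phi, so raising the
% ELBO lowers the KL divergence by the same amount.
\log p(y)
  = \underbrace{\mathbb{E}_{q_\phi(z)}\!\left[\log p(y \mid z) + \log p(z) - \log q_\phi(z)\right]}_{\mathrm{ELBO}(\phi)}
  + \mathrm{KL}\!\left(q_\phi(z)\,\|\,p(z \mid y)\right)

% Density of a flow z_K = f_K \circ \cdots \circ f_1(z_0) with z_0 \sim q_0,
% obtained from the change-of-variables formula:
\log q_\phi(z_K)
  = \log q_0(z_0) - \sum_{k=1}^{K} \log\left|\det \frac{\partial f_k}{\partial z_{k-1}}\right|
```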

    HINT: Hierarchical Invertible Neural Transport for Density Estimation and Bayesian Inference

    A large proportion of recent invertible neural architectures is based on a coupling block design. It operates by dividing incoming variables into two sub-spaces, one of which parameterizes an easily invertible (usually affine) transformation that is applied to the other. While the Jacobian of such a transformation is triangular, it is very sparse and thus may lack expressiveness. This work presents a simple remedy by noting that (affine) coupling can be repeated recursively within the resulting sub-spaces, leading to an efficiently invertible block with dense triangular Jacobian. By formulating our recursive coupling scheme via a hierarchical architecture, HINT allows sampling from a joint distribution p(y,x) and the corresponding posterior p(x|y) using a single invertible network. We demonstrate the power of our method for density estimation and Bayesian inference on a novel data set of 2D shapes in Fourier parameterization, which enables consistent visualization of samples for different dimensionalities.
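
The coupling block design described in this abstract can be sketched as follows. This is a single plain affine coupling block, not the recursive HINT scheme, and the "networks" are hypothetical fixed random maps standing in for trained sub-networks.

```python
import numpy as np

def make_toy_net(rng, d_in, d_out):
    """Hypothetical stand-in for a learned sub-network: a fixed random tanh layer."""
    w = rng.standard_normal((d_in, d_out))
    return lambda h: np.tanh(h @ w)

def coupling_forward(x, scale_net, shift_net):
    """Affine coupling: the first half passes through, the second half is transformed."""
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    s, t = scale_net(x1), shift_net(x1)
    y2 = x2 * np.exp(s) + t
    log_det = np.sum(s, axis=-1)  # log|det J| of the triangular Jacobian
    return np.concatenate([x1, y2], axis=-1), log_det

def coupling_inverse(y, scale_net, shift_net):
    """Exact inverse: recompute s and t from the untouched half, then undo the map."""
    d = y.shape[-1] // 2
    y1, y2 = y[..., :d], y[..., d:]
    s, t = scale_net(y1), shift_net(y1)
    x2 = (y2 - t) * np.exp(-s)
    return np.concatenate([y1, x2], axis=-1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s_net, t_net = make_toy_net(rng, 2, 2), make_toy_net(rng, 2, 2)
    x = rng.standard_normal((5, 4))
    y, log_det = coupling_forward(x, s_net, t_net)
    print(np.allclose(coupling_inverse(y, s_net, t_net), x))
```

Because the first half is copied unchanged, the inverse never has to invert the sub-networks themselves, which is why this block is cheap to invert.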

    Flow Annealed Kalman Inversion for Gradient-Free Inference in Bayesian Inverse Problems

    For many scientific inverse problems we are required to evaluate an expensive forward model. Moreover, the model is often given in such a form that it is unrealistic to access its gradients. In such a scenario, standard Markov Chain Monte Carlo algorithms quickly become impractical, requiring a large number of serial model evaluations to converge on the target distribution. In this paper we introduce Flow Annealed Kalman Inversion (FAKI). This is a generalization of Ensemble Kalman Inversion (EKI), where we embed the Kalman filter updates in a temperature annealing scheme, and use normalizing flows (NF) to map the intermediate measures corresponding to each temperature level to the standard Gaussian. In doing so, we relax the Gaussian ansatz for the intermediate measures used in standard EKI, allowing us to achieve higher fidelity approximations to non-Gaussian targets. We demonstrate the performance of FAKI on two numerical benchmarks, showing dramatic improvements over standard EKI in terms of accuracy whilst accelerating its already rapid convergence properties (typically in $\mathcal{O}(10)$ steps). Comment: 9 pages, 2 figures. Presented at MaxEnt 2023. Modified version to appear in MaxEnt 2023 proceedings
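
For orientation, a tempered EKI update of the kind this abstract builds on might look like the sketch below. It covers only standard EKI with perturbed observations inside an annealing schedule; the normalizing-flow step that defines FAKI is deliberately left out. The function name, argument names, and the uniform schedule are assumptions.

```python
import numpy as np

def eki_step(ensemble, forward, y, noise_cov, d_beta, rng):
    """One tempered EKI update with perturbed observations (gradient-free).

    ensemble:  (J, d) array of parameter particles
    forward:   callable mapping a (d,) vector to a (k,) vector
    noise_cov: (k, k) observation noise covariance
    d_beta:    inverse-temperature increment; increments should sum to 1
    """
    n = ensemble.shape[0]
    g = np.array([forward(u) for u in ensemble])      # (J, k) model outputs
    du = ensemble - ensemble.mean(axis=0)
    dg = g - g.mean(axis=0)
    c_ug = du.T @ dg / (n - 1)                        # (d, k) cross-covariance
    c_gg = dg.T @ dg / (n - 1)                        # (k, k) output covariance
    scaled_cov = noise_cov / d_beta                   # inflated noise for a partial step
    noise = rng.multivariate_normal(np.zeros(len(y)), scaled_cov, size=n)
    residual = y + noise - g                          # (J, k)
    solved = np.linalg.solve(c_gg + scaled_cov, residual.T)  # (k, J)
    return ensemble + (c_ug @ solved).T

def run_eki(ensemble, forward, y, noise_cov, n_steps, rng):
    """Anneal from prior (beta = 0) to posterior (beta = 1) in uniform steps."""
    for _ in range(n_steps):
        ensemble = eki_step(ensemble, forward, y, noise_cov, 1.0 / n_steps, rng)
    return ensemble
```

For a linear forward model with a Gaussian prior, the ensemble mean after the final step should approximate the exact posterior mean, which makes a convenient sanity check; for non-Gaussian targets the Gaussian ansatz in each step is the limitation the abstract refers to.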

    Inference via low-dimensional couplings

    We investigate the low-dimensional structure of deterministic transformations between random variables, i.e., transport maps between probability measures. In the context of statistics and machine learning, these transformations can be used to couple a tractable "reference" measure (e.g., a standard Gaussian) with a target measure of interest. Direct simulation from the desired measure can then be achieved by pushing forward reference samples through the map. Yet characterizing such a map (e.g., representing and evaluating it) grows challenging in high dimensions. The central contribution of this paper is to establish a link between the Markov properties of the target measure and the existence of low-dimensional couplings, induced by transport maps that are sparse and/or decomposable. Our analysis not only facilitates the construction of transformations in high-dimensional settings, but also suggests new inference methodologies for continuous non-Gaussian graphical models. For instance, in the context of nonlinear state-space models, we describe new variational algorithms for filtering, smoothing, and sequential parameter inference. These algorithms can be understood as the natural generalization, to the non-Gaussian case, of the square-root Rauch-Tung-Striebel Gaussian smoother. Comment: 78 pages, 25 figures
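
The idea of simulating from a target by pushing reference samples through a map can be shown in one dimension, where the monotone transport map has a closed form. The sketch below assumes a unit-rate exponential target, chosen only because its inverse CDF is explicit; it says nothing about the sparse or decomposable maps studied in the paper.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc)

def gaussian_to_exponential(z):
    """Monotone map T = F_target^{-1} o Phi from N(0, 1) to Exp(1).

    Uses F_target^{-1}(p) = -log(1 - p) and 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
    """
    z = np.asarray(z, dtype=float)
    return -np.log(0.5 * _erfc(z / math.sqrt(2.0)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.standard_normal(100_000)   # samples from the reference measure
    target = gaussian_to_exponential(reference)
    print(target.mean(), target.var())         # both should be close to 1
```

In higher dimensions no such closed form is generally available, which is where the structure of the map becomes the main concern.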

    Data Assimilation and Inverse Problems

    These notes are designed with the aim of providing a clear and concise introduction to the subjects of Inverse Problems and Data Assimilation, and their inter-relations, together with citations to some relevant literature in this area. The first half of the notes is dedicated to studying the Bayesian framework for inverse problems. Techniques such as importance sampling and Markov Chain Monte Carlo (MCMC) methods are introduced; these methods have the desirable property that in the limit of an infinite number of samples they reproduce the full posterior distribution. Since it is often computationally intensive to implement these methods, especially in high dimensional problems, approximate techniques such as approximating the posterior by a Dirac or a Gaussian distribution are discussed. The second half of the notes covers data assimilation. This refers to a particular class of inverse problems in which the unknown parameter is the initial condition of a dynamical system, and in the stochastic dynamics case the subsequent states of the system, and the data comprises partial and noisy observations of that (possibly stochastic) dynamical system. We will also demonstrate that methods developed in data assimilation may be employed to study generic inverse problems, by introducing an artificial time to generate a sequence of probability measures interpolating from the prior to the posterior.
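
One common way to realize the "artificial time" interpolation mentioned at the end of this abstract is likelihood tempering. The notation below is an assumption for illustration (prior $\mu_0$, posterior $\mu_1$, negative log-likelihood $\Phi(u; y)$); the notes may use a different construction.

```latex
% Continuous interpolation between prior (t = 0) and posterior (t = 1):
\mu_t(\mathrm{d}u) \;\propto\; \exp\!\big(-t\,\Phi(u; y)\big)\,\mu_0(\mathrm{d}u),
\qquad t \in [0, 1]

% Discretized with t_n = n/N, each step looks like a Bayesian update with a
% likelihood raised to the power 1/N, so filtering methods can be applied:
\mu_{n+1}(\mathrm{d}u) \;\propto\; \exp\!\Big(-\tfrac{1}{N}\,\Phi(u; y)\Big)\,\mu_n(\mathrm{d}u),
\qquad n = 0, \dots, N-1
```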