
    Likelihood-informed dimension reduction for nonlinear inverse problems

    The intrinsic dimensionality of an inverse problem is affected by prior information, the accuracy and number of observations, and the smoothing properties of the forward operator. From a Bayesian perspective, changes from the prior to the posterior may, in many problems, be confined to a relatively low-dimensional subspace of the parameter space. We present a dimension reduction approach that defines and identifies such a subspace, called the "likelihood-informed subspace" (LIS), by characterizing the relative influences of the prior and the likelihood over the support of the posterior distribution. This identification enables new and more efficient computational methods for Bayesian inference with nonlinear forward models and Gaussian priors. In particular, we approximate the posterior distribution as the product of a lower-dimensional posterior defined on the LIS and the prior distribution marginalized onto the complementary subspace. Markov chain Monte Carlo sampling can then proceed in lower dimensions, with significant gains in computational efficiency. We also introduce a Rao-Blackwellization strategy that de-randomizes Monte Carlo estimates of posterior expectations for additional variance reduction. We demonstrate the efficiency of our methods using two numerical examples: inference of permeability in a groundwater system governed by an elliptic PDE, and an atmospheric remote sensing problem based on Global Ozone Monitoring System (GOMOS) observations.
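
The idea of a likelihood-informed subspace is easiest to see in the linear-Gaussian special case, where the Hessian of the negative log-likelihood is constant. The sketch below is an illustration under that simplifying assumption, not the nonlinear algorithm of the paper: the forward operator `G`, the noise variance, the prior covariance, and the truncation `threshold` are all hypothetical values chosen for the example. Directions in which the prior-preconditioned Hessian has large eigenvalues are those where the data dominate the prior.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 5                      # parameter and data dimensions (illustrative)
G = rng.standard_normal((m, n))   # hypothetical linear forward operator
noise_var = 0.1                   # assumed i.i.d. observation noise variance
# Assumed Gaussian prior covariance with decaying variances.
prior_cov = np.diag(1.0 / (1.0 + np.arange(n)) ** 2)

L = np.linalg.cholesky(prior_cov)   # prior_cov = L @ L.T
H = G.T @ G / noise_var             # Hessian of the negative log-likelihood
ppH = L.T @ H @ L                   # prior-preconditioned Hessian

# Eigenpairs sorted from most to least likelihood-informed.
eigvals, eigvecs = np.linalg.eigh(ppH)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

threshold = 0.1                     # hypothetical truncation threshold
r = int(np.sum(eigvals > threshold))
lis_basis = L @ eigvecs[:, :r]      # columns span the retained directions

print(f"retained {r} of {n} directions")
```

Because `H` has rank at most `m`, no more than `m` directions can be retained here, however large `n` is. In the nonlinear setting the Hessian varies with the parameter, which is why the abstract speaks of characterizing the prior and likelihood influences over the support of the posterior.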

    Optimal projection of observations in a Bayesian setting

    Optimal dimensionality reduction methods are proposed for the Bayesian inference of a Gaussian linear model with additive noise in the presence of overabundant data. Three different optimal projections of the observations are proposed based on information theory: the projection that minimizes the Kullback-Leibler divergence between the posterior distributions of the original and the projected models, the one that minimizes the expected Kullback-Leibler divergence between the same distributions, and the one that maximizes the mutual information between the parameter of interest and the projected observations. The first two optimization problems are formulated as the determination of an optimal subspace and therefore the solution is computed using Riemannian optimization algorithms on the Grassmann manifold. Regarding the maximization of the mutual information, it is shown that there exists an optimal subspace that minimizes the entropy of the posterior distribution of the reduced model; a basis of the subspace can be computed as the solution to a generalized eigenvalue problem; an a priori error estimate on the mutual information is available for this particular solution; and that the dimensionality of the subspace to exactly conserve the mutual information between the input and the output of the models is less than the number of parameters to be inferred. Numerical applications to linear and nonlinear models are used to assess the efficiency of the proposed approaches, and to highlight their advantages compared to standard approaches based on the principal component analysis of the observations.
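
The mutual-information claim can be checked numerically on a small linear Gaussian model. The sketch below assumes the model y = A x + e with Gaussian prior and noise, and uses the textbook generalized eigenvalue problem pairing the signal covariance with the noise covariance; the abstract does not state which matrices the paper uses, so this pairing, along with the names `A`, `prior_cov`, `noise_cov`, and `mutual_information`, is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_par, n_obs = 3, 12                     # few parameters, many observations
A = rng.standard_normal((n_obs, n_par))  # hypothetical linear model
prior_cov = np.eye(n_par)                # assumed prior covariance
B = rng.standard_normal((n_obs, n_obs))
noise_cov = B @ B.T + n_obs * np.eye(n_obs)  # assumed correlated SPD noise

signal_cov = A @ prior_cov @ A.T         # covariance of the noise-free output

# Solve signal_cov w = lam * noise_cov w by whitening with a Cholesky factor.
C = np.linalg.cholesky(noise_cov)        # noise_cov = C @ C.T
C_inv = np.linalg.inv(C)
lam, V = np.linalg.eigh(C_inv @ signal_cov @ C_inv.T)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
W = C_inv.T @ V[:, :n_par]               # leading generalized eigenvectors

def mutual_information(P):
    """I(x; P^T y) in nats for the linear Gaussian model above."""
    num = P.T @ (signal_cov + noise_cov) @ P
    den = P.T @ noise_cov @ P
    return 0.5 * (np.linalg.slogdet(num)[1] - np.linalg.slogdet(den)[1])

mi_full = mutual_information(np.eye(n_obs))
mi_reduced = mutual_information(W)
print(mi_full, mi_reduced)
```

In this toy setting, projecting the 12 observations onto the 3 leading generalized eigenvectors leaves the mutual information unchanged up to rounding, because the signal covariance has rank at most `n_par`.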

    TopSig: Topology Preserving Document Signatures

    Performance comparisons between File Signatures and Inverted Files for text retrieval have previously shown several significant shortcomings of file signatures relative to inverted files. The inverted file approach underpins most state-of-the-art search engine algorithms, such as Language and Probabilistic models. It has been widely accepted that traditional file signatures are inferior alternatives to inverted files. This paper describes TopSig, a new approach to the construction of file signatures. Many advances in semantic hashing and dimensionality reduction have been made in recent times, but these were not so far linked to general purpose, signature file based, search engines. This paper introduces a different signature file approach that builds upon and extends these recent advances. We are able to demonstrate significant improvements in the performance of signature file based indexing and retrieval, performance that is comparable to that of state-of-the-art inverted file based systems, including Language models and BM25. These findings suggest that file signatures offer a viable alternative to inverted files in suitable settings and from the theoretical perspective it positions the file signatures model in the class of Vector Space retrieval models. Comment: 12 pages, 8 figures, CIKM 201
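
To make the notion of a document signature concrete, here is a minimal sketch of a generic random-projection signature with Hamming-distance ranking. It is not the TopSig construction itself, whose details the abstract does not give: the 64-bit width, the hash-seeded ±1 term vectors, raw term-frequency weighting, and the function names are all assumptions made for the example.

```python
import hashlib
import random
from collections import Counter

BITS = 64  # signature width (illustrative choice)

def term_vector(term):
    """Pseudo-random +/-1 vector for a term, seeded by a hash of the term."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.choice((-1, 1)) for _ in range(BITS)]

def signature(text):
    """Sum term vectors weighted by term frequency, then keep only the signs."""
    totals = [0] * BITS
    for term, tf in Counter(text.lower().split()).items():
        for i, v in enumerate(term_vector(term)):
            totals[i] += tf * v
    return sum(1 << i for i, t in enumerate(totals) if t > 0)

def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")

docs = [
    "inverted files for text retrieval",
    "file signatures for text retrieval",
    "elliptic pde for groundwater permeability",
]
index = [signature(d) for d in docs]

query = signature("file signatures for text retrieval")
ranked = sorted((hamming(query, s), d) for s, d in zip(index, docs))
for dist, doc in ranked:
    print(dist, doc)
```

Documents that share many terms tend to agree on more signature bits, so a smaller Hamming distance serves as a proxy for similarity in the underlying term vector space.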