21,618 research outputs found
Likelihood-informed dimension reduction for nonlinear inverse problems
The intrinsic dimensionality of an inverse problem is affected by prior
information, the accuracy and number of observations, and the smoothing
properties of the forward operator. From a Bayesian perspective, changes from
the prior to the posterior may, in many problems, be confined to a relatively
low-dimensional subspace of the parameter space. We present a dimension
reduction approach that defines and identifies such a subspace, called the
"likelihood-informed subspace" (LIS), by characterizing the relative influences
of the prior and the likelihood over the support of the posterior distribution.
This identification enables new and more efficient computational methods for
Bayesian inference with nonlinear forward models and Gaussian priors. In
particular, we approximate the posterior distribution as the product of a
lower-dimensional posterior defined on the LIS and the prior distribution
marginalized onto the complementary subspace. Markov chain Monte Carlo sampling
can then proceed in lower dimensions, with significant gains in computational
efficiency. We also introduce a Rao-Blackwellization strategy that
de-randomizes Monte Carlo estimates of posterior expectations for additional
variance reduction. We demonstrate the efficiency of our methods using two
numerical examples: inference of permeability in a groundwater system governed
by an elliptic PDE, and an atmospheric remote sensing problem based on Global
Ozone Monitoring System (GOMOS) observations
Optimal projection of observations in a Bayesian setting
Optimal dimensionality reduction methods are proposed for the Bayesian
inference of a Gaussian linear model with additive noise in presence of
overabundant data. Three different optimal projections of the observations are
proposed based on information theory: the projection that minimizes the
Kullback-Leibler divergence between the posterior distributions of the original
and the projected models, the one that minimizes the expected Kullback-Leibler
divergence between the same distributions, and the one that maximizes the
mutual information between the parameter of interest and the projected
observations. The first two optimization problems are formulated as the
determination of an optimal subspace and therefore the solution is computed
using Riemannian optimization algorithms on the Grassmann manifold. Regarding
the maximization of the mutual information, it is shown that there exists an
optimal subspace that minimizes the entropy of the posterior distribution of
the reduced model; a basis of the subspace can be computed as the solution to a
generalized eigenvalue problem; an a priori error estimate on the mutual
information is available for this particular solution; and that the
dimensionality of the subspace to exactly conserve the mutual information
between the input and the output of the models is less than the number of
parameters to be inferred. Numerical applications to linear and nonlinear
models are used to assess the efficiency of the proposed approaches, and to
highlight their advantages compared to standard approaches based on the
principal component analysis of the observations
TopSig: Topology Preserving Document Signatures
Performance comparisons between File Signatures and Inverted Files for text
retrieval have previously shown several significant shortcomings of file
signatures relative to inverted files. The inverted file approach underpins
most state-of-the-art search engine algorithms, such as Language and
Probabilistic models. It has been widely accepted that traditional file
signatures are inferior alternatives to inverted files. This paper describes
TopSig, a new approach to the construction of file signatures. Many advances in
semantic hashing and dimensionality reduction have been made in recent times,
but these were not so far linked to general purpose, signature file based,
search engines. This paper introduces a different signature file approach that
builds upon and extends these recent advances. We are able to demonstrate
significant improvements in the performance of signature file based indexing
and retrieval, performance that is comparable to that of state of the art
inverted file based systems, including Language models and BM25. These findings
suggest that file signatures offer a viable alternative to inverted files in
suitable settings and from the theoretical perspective it positions the file
signatures model in the class of Vector Space retrieval models.Comment: 12 pages, 8 figures, CIKM 201
- …