836 research outputs found
Randomized Riemannian Preconditioning for Orthogonality Constrained Problems
Optimization problems with (generalized) orthogonality constraints are
prevalent across science and engineering. For example, in computational science
they arise in the symmetric (generalized) eigenvalue problem, in nonlinear
eigenvalue problems, and in electronic structures computations, to name a few
problems. In statistics and machine learning, they arise, for example, in
canonical correlation analysis and in linear discriminant analysis. In this
article, we consider using randomized preconditioning in the context of
optimization problems with generalized orthogonality constraints. Our proposed
algorithms are based on Riemannian optimization on the generalized Stiefel
manifold equipped with a non-standard preconditioned geometry, which
necessitates development of the geometric components necessary for developing
algorithms based on this approach. Furthermore, we perform asymptotic
convergence analysis of the preconditioned algorithms which help to
characterize the quality of a given preconditioner using second-order
information. Finally, for the problems of canonical correlation analysis and
linear discriminant analysis, we develop randomized preconditioners along with
corresponding bounds on the relevant condition number
On the Convergence of the Laplace Approximation and Noise-Level-Robustness of Laplace-based Monte Carlo Methods for Bayesian Inverse Problems
The Bayesian approach to inverse problems provides a rigorous framework for
the incorporation and quantification of uncertainties in measurements,
parameters and models. We are interested in designing numerical methods which
are robust w.r.t. the size of the observational noise, i.e., methods which
behave well in case of concentrated posterior measures. The concentration of
the posterior is a highly desirable situation in practice, since it relates to
informative or large data. However, it can pose a computational challenge for
numerical methods based on the prior or reference measure. We propose to employ
the Laplace approximation of the posterior as the base measure for numerical
integration in this context. The Laplace approximation is a Gaussian measure
centered at the maximum a-posteriori estimate and with covariance matrix
depending on the logposterior density. We discuss convergence results of the
Laplace approximation in terms of the Hellinger distance and analyze the
efficiency of Monte Carlo methods based on it. In particular, we show that
Laplace-based importance sampling and Laplace-based quasi-Monte-Carlo methods
are robust w.r.t. the concentration of the posterior for large classes of
posterior distributions and integrands whereas prior-based importance sampling
and plain quasi-Monte Carlo are not. Numerical experiments are presented to
illustrate the theoretical findings.Comment: 50 pages, 11 figure
Conjugate-Gradient Preconditioning Methods for Shift-Variant PET Image Reconstruction
Gradient-based iterative methods often converge slowly for tomographic image reconstruction and image restoration problems, but can be accelerated by suitable preconditioners. Diagonal preconditioners offer some improvement in convergence rate, but do not incorporate the structure of the Hessian matrices in imaging problems. Circulant preconditioners can provide remarkable acceleration for inverse problems that are approximately shift-invariant, i.e., for those with approximately block-Toeplitz or block-circulant Hessians. However, in applications with nonuniform noise variance, such as arises from Poisson statistics in emission tomography and in quantum-limited optical imaging, the Hessian of the weighted least-squares objective function is quite shift-variant, and circulant preconditioners perform poorly. Additional shift-variance is caused by edge-preserving regularization methods based on nonquadratic penalty functions. This paper describes new preconditioners that approximate more accurately the Hessian matrices of shift-variant imaging problems. Compared to diagonal or circulant preconditioning, the new preconditioners lead to significantly faster convergence rates for the unconstrained conjugate-gradient (CG) iteration. We also propose a new efficient method for the line-search step required by CG methods. Applications to positron emission tomography (PET) illustrate the method.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/85979/1/Fessler85.pd
Optimal low-rank approximations of Bayesian linear inverse problems
In the Bayesian approach to inverse problems, data are often informative,
relative to the prior, only on a low-dimensional subspace of the parameter
space. Significant computational savings can be achieved by using this subspace
to characterize and approximate the posterior distribution of the parameters.
We first investigate approximation of the posterior covariance matrix as a
low-rank update of the prior covariance matrix. We prove optimality of a
particular update, based on the leading eigendirections of the matrix pencil
defined by the Hessian of the negative log-likelihood and the prior precision,
for a broad class of loss functions. This class includes the F\"{o}rstner
metric for symmetric positive definite matrices, as well as the
Kullback-Leibler divergence and the Hellinger distance between the associated
distributions. We also propose two fast approximations of the posterior mean
and prove their optimality with respect to a weighted Bayes risk under
squared-error loss. These approximations are deployed in an offline-online
manner, where a more costly but data-independent offline calculation is
followed by fast online evaluations. As a result, these approximations are
particularly useful when repeated posterior mean evaluations are required for
multiple data sets. We demonstrate our theoretical results with several
numerical examples, including high-dimensional X-ray tomography and an inverse
heat conduction problem. In both of these examples, the intrinsic
low-dimensional structure of the inference problem can be exploited while
producing results that are essentially indistinguishable from solutions
computed in the full space
- …