836 research outputs found

    Randomized Riemannian Preconditioning for Orthogonality Constrained Problems

    Get PDF
    Optimization problems with (generalized) orthogonality constraints are prevalent across science and engineering. For example, in computational science they arise in the symmetric (generalized) eigenvalue problem, in nonlinear eigenvalue problems, and in electronic structures computations, to name a few problems. In statistics and machine learning, they arise, for example, in canonical correlation analysis and in linear discriminant analysis. In this article, we consider using randomized preconditioning in the context of optimization problems with generalized orthogonality constraints. Our proposed algorithms are based on Riemannian optimization on the generalized Stiefel manifold equipped with a non-standard preconditioned geometry, which necessitates development of the geometric components necessary for developing algorithms based on this approach. Furthermore, we perform asymptotic convergence analysis of the preconditioned algorithms which help to characterize the quality of a given preconditioner using second-order information. Finally, for the problems of canonical correlation analysis and linear discriminant analysis, we develop randomized preconditioners along with corresponding bounds on the relevant condition number

    On the Convergence of the Laplace Approximation and Noise-Level-Robustness of Laplace-based Monte Carlo Methods for Bayesian Inverse Problems

    Get PDF
    The Bayesian approach to inverse problems provides a rigorous framework for the incorporation and quantification of uncertainties in measurements, parameters and models. We are interested in designing numerical methods which are robust w.r.t. the size of the observational noise, i.e., methods which behave well in case of concentrated posterior measures. The concentration of the posterior is a highly desirable situation in practice, since it relates to informative or large data. However, it can pose a computational challenge for numerical methods based on the prior or reference measure. We propose to employ the Laplace approximation of the posterior as the base measure for numerical integration in this context. The Laplace approximation is a Gaussian measure centered at the maximum a-posteriori estimate and with covariance matrix depending on the logposterior density. We discuss convergence results of the Laplace approximation in terms of the Hellinger distance and analyze the efficiency of Monte Carlo methods based on it. In particular, we show that Laplace-based importance sampling and Laplace-based quasi-Monte-Carlo methods are robust w.r.t. the concentration of the posterior for large classes of posterior distributions and integrands whereas prior-based importance sampling and plain quasi-Monte Carlo are not. Numerical experiments are presented to illustrate the theoretical findings.Comment: 50 pages, 11 figure

    Conjugate-Gradient Preconditioning Methods for Shift-Variant PET Image Reconstruction

    Full text link
    Gradient-based iterative methods often converge slowly for tomographic image reconstruction and image restoration problems, but can be accelerated by suitable preconditioners. Diagonal preconditioners offer some improvement in convergence rate, but do not incorporate the structure of the Hessian matrices in imaging problems. Circulant preconditioners can provide remarkable acceleration for inverse problems that are approximately shift-invariant, i.e., for those with approximately block-Toeplitz or block-circulant Hessians. However, in applications with nonuniform noise variance, such as arises from Poisson statistics in emission tomography and in quantum-limited optical imaging, the Hessian of the weighted least-squares objective function is quite shift-variant, and circulant preconditioners perform poorly. Additional shift-variance is caused by edge-preserving regularization methods based on nonquadratic penalty functions. This paper describes new preconditioners that approximate more accurately the Hessian matrices of shift-variant imaging problems. Compared to diagonal or circulant preconditioning, the new preconditioners lead to significantly faster convergence rates for the unconstrained conjugate-gradient (CG) iteration. We also propose a new efficient method for the line-search step required by CG methods. Applications to positron emission tomography (PET) illustrate the method.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/85979/1/Fessler85.pd

    Optimal low-rank approximations of Bayesian linear inverse problems

    Full text link
    In the Bayesian approach to inverse problems, data are often informative, relative to the prior, only on a low-dimensional subspace of the parameter space. Significant computational savings can be achieved by using this subspace to characterize and approximate the posterior distribution of the parameters. We first investigate approximation of the posterior covariance matrix as a low-rank update of the prior covariance matrix. We prove optimality of a particular update, based on the leading eigendirections of the matrix pencil defined by the Hessian of the negative log-likelihood and the prior precision, for a broad class of loss functions. This class includes the F\"{o}rstner metric for symmetric positive definite matrices, as well as the Kullback-Leibler divergence and the Hellinger distance between the associated distributions. We also propose two fast approximations of the posterior mean and prove their optimality with respect to a weighted Bayes risk under squared-error loss. These approximations are deployed in an offline-online manner, where a more costly but data-independent offline calculation is followed by fast online evaluations. As a result, these approximations are particularly useful when repeated posterior mean evaluations are required for multiple data sets. We demonstrate our theoretical results with several numerical examples, including high-dimensional X-ray tomography and an inverse heat conduction problem. In both of these examples, the intrinsic low-dimensional structure of the inference problem can be exploited while producing results that are essentially indistinguishable from solutions computed in the full space
    corecore