
    Optimal low-rank approximations of Bayesian linear inverse problems

    In the Bayesian approach to inverse problems, data are often informative, relative to the prior, only on a low-dimensional subspace of the parameter space. Significant computational savings can be achieved by using this subspace to characterize and approximate the posterior distribution of the parameters. We first investigate approximation of the posterior covariance matrix as a low-rank update of the prior covariance matrix. We prove optimality of a particular update, based on the leading eigendirections of the matrix pencil defined by the Hessian of the negative log-likelihood and the prior precision, for a broad class of loss functions. This class includes the Förstner metric for symmetric positive definite matrices, as well as the Kullback-Leibler divergence and the Hellinger distance between the associated distributions. We also propose two fast approximations of the posterior mean and prove their optimality with respect to a weighted Bayes risk under squared-error loss. These approximations are deployed in an offline-online manner, where a more costly but data-independent offline calculation is followed by fast online evaluations. As a result, these approximations are particularly useful when repeated posterior mean evaluations are required for multiple data sets. We demonstrate our theoretical results with several numerical examples, including high-dimensional X-ray tomography and an inverse heat conduction problem. In both of these examples, the intrinsic low-dimensional structure of the inference problem can be exploited while producing results that are essentially indistinguishable from solutions computed in the full space.
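
    A minimal sketch of the covariance construction described above, assuming a linear-Gaussian setting in which the Hessian of the negative log-likelihood `H` and the prior covariance `Gpr` are available as dense matrices (all names are illustrative): solve the generalized eigenproblem for the pencil (H, Gpr^{-1}) and subtract the optimal rank-r update from the prior.

```python
import numpy as np
from scipy.linalg import eigh

def lowrank_posterior_cov(H, Gpr, r):
    """Approximate Gamma_pos = (H + Gpr^{-1})^{-1} by a rank-r
    negative update of the prior covariance Gpr (SPD)."""
    Gpr_inv = np.linalg.inv(Gpr)
    # Pencil (H, Gpr^{-1}): H w = lam * Gpr^{-1} w, with W^T Gpr^{-1} W = I.
    lam, W = eigh(H, Gpr_inv)
    # eigh returns ascending eigenvalues; keep the r leading directions.
    lam, W = lam[::-1][:r], W[:, ::-1][:, :r]
    # Optimal update: Gpr - sum_i lam_i/(1+lam_i) * w_i w_i^T.
    return Gpr - (W * (lam / (1.0 + lam))) @ W.T
```

    When the data are informative only in a few directions (i.e. H has small numerical rank relative to the prior), a small r already yields a posterior covariance essentially indistinguishable from the full-space one.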

    Bayesian forecasting and scalable multivariate volatility analysis using simultaneous graphical dynamic models

    The recently introduced class of simultaneous graphical dynamic linear models (SGDLMs) makes it possible to scale on-line Bayesian analysis and forecasting to higher-dimensional time series. This paper advances the methodology of SGDLMs, developing and embedding a novel, adaptive method of simultaneous predictor selection in forward filtering for on-line learning and forecasting. The advances include developments in Bayesian computation for scalability, together with a case study exploring the resulting potential for improved short-term forecasting of large-scale volatility matrices. The case study concerns financial forecasting and portfolio optimization with a 400-dimensional series of daily stock prices. Analysis shows that the SGDLM forecasts volatilities and co-volatilities well, making it well suited to contributing to quantitative investment strategies that aim to improve portfolio returns. We also identify performance metrics linked to the sequential Bayesian filtering analysis that turn out to define a leading indicator of increased financial market stress, comparable to, but leading, the standard St. Louis Fed Financial Stress Index (STLFSI) measure. Parallel computation using GPU implementations substantially advances the ability to fit and use these models. Comment: 28 pages, 9 figures, 7 tables
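
    As a rough illustration of the forward-filtering step on which such sequential analyses are built, here is a plain one-dimensional local-level dynamic linear model filter. This is a generic DLM building block, not the SGDLM itself, and all names and variances are illustrative.

```python
import numpy as np

def forward_filter(y, obs_var=1.0, state_var=0.1):
    """Local-level DLM: theta_t = theta_{t-1} + w_t,  y_t = theta_t + v_t.
    Returns filtered means and one-step forecast (mean, variance) pairs."""
    m, C = 0.0, 1e6                # diffuse prior on the level
    means, forecasts = [], []
    for yt in np.asarray(y, dtype=float):
        R = C + state_var          # evolved (prior) variance at time t
        f, Q = m, R + obs_var      # one-step forecast mean and variance
        forecasts.append((f, Q))
        A = R / Q                  # adaptive coefficient (Kalman gain)
        m = m + A * (yt - f)       # filtered mean
        C = R - A * A * Q          # filtered variance
        means.append(m)
    return np.array(means), forecasts
```

    An SGDLM couples many such univariate filters through contemporaneous ("simultaneous") predictors drawn from the other series, which is what the adaptive predictor selection operates on.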

    On the Regularizing Property of Stochastic Gradient Descent

    Stochastic gradient descent is one of the most successful approaches for solving large-scale problems, especially in machine learning and statistics. At each iteration, it employs an unbiased estimator of the full gradient computed from one single randomly selected data point. Hence, it scales well with problem size, is very attractive for truly massive datasets, and holds significant potential for solving large-scale inverse problems. In the recent machine learning literature, it was empirically observed that, when equipped with early stopping, it has a regularizing property. In this work, we rigorously establish this regularizing property (under an a priori early stopping rule), and also prove convergence rates under the canonical sourcewise condition, for minimizing the quadratic functional arising in linear inverse problems. This is achieved by combining tools from classical regularization theory and stochastic analysis. Further, we analyze the preasymptotic weak and strong convergence behavior of the algorithm. The theoretical findings shed light on the performance of the algorithm and are complemented with illustrative numerical experiments. Comment: 22 pages, better presentation
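
    A hedged sketch of the algorithm analyzed here, for the discrete linear inverse problem min_x (1/2n)||Ax - b||^2: each step uses one randomly selected row as an unbiased gradient estimate, and the a-priori-chosen iteration count plays the role of the regularization parameter (names and the stopping index are illustrative, not the paper's exact rule).

```python
import numpy as np

def sgd_early_stopped(A, b, n_iters, step=0.1, seed=0):
    """SGD for (1/2n)||Ax - b||^2; stopping after n_iters regularizes."""
    n, d = A.shape
    x = np.zeros(d)
    rng = np.random.default_rng(seed)
    for _ in range(n_iters):
        i = rng.integers(n)                     # one random data point
        x -= step * (A[i] @ x - b[i]) * A[i]    # unbiased gradient step
    return x
```

    Too few iterations over-smooths and too many fit the noise; an a priori rule ties n_iters to the noise level, much as classical iterative regularization does for Landweber-type methods.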

    Optimization Methods for Inverse Problems

    Optimization plays an important role in solving many inverse problems. Indeed, the task of inversion often either involves or is fully cast as the solution of an optimization problem. In this light, the non-linear, non-convex, and large-scale nature of many of these inversions gives rise to some very challenging optimization problems. The inverse problem community has long been developing various techniques for solving such optimization tasks. However, other, seemingly disjoint communities, such as that of machine learning, have developed, almost in parallel, interesting alternative methods which might have stayed under the radar of the inverse problem community. In this survey, we aim to change that. In doing so, we first discuss current state-of-the-art optimization methods widely used in inverse problems. We then survey recent related advances in addressing similar challenges in problems faced by the machine learning community, and discuss their potential advantages for solving inverse problems. By highlighting the similarities among the optimization challenges faced by the inverse problem and machine learning communities, we hope that this survey can serve as a bridge between these two communities and encourage cross-fertilization of ideas. Comment: 13 pages
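
    To make "inversion cast as an optimization problem" concrete, here is a minimal Tikhonov-regularized least-squares inversion solved by plain gradient descent; the operator A, data y, and all parameters are illustrative stand-ins.

```python
import numpy as np

def invert(A, y, alpha=1e-2, step=1e-3, n_iters=5000):
    """Minimize J(x) = 0.5*||Ax - y||^2 + 0.5*alpha*||x||^2."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y) + alpha * x   # exact gradient of J
        x -= step * grad
    return x
```

    Once the inversion is phrased this way, machinery developed for large-scale machine-learning objectives (stochastic gradients, variance reduction, second-order methods) can be applied almost verbatim, which is the bridge the survey explores.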

    Nonlinear Attitude Filtering: A Comparison Study

    This paper contains a concise comparison of a number of nonlinear attitude filtering methods that have attracted attention in the robotics and aviation literature. With the help of previously published surveys and comparison studies, the vast literature on the subject is narrowed down to a small pool of competitive attitude filters. Amongst these filters is a second-order optimal minimum-energy filter recently proposed by the authors. Easily comparable discretized unit quaternion implementations of the selected filters are provided. We conduct a simulation study and compare the transient behaviour and asymptotic convergence of these filters in two scenarios with different initialization and measurement errors, inspired by applications in unmanned aerial robotics and space flight. The second-order optimal minimum-energy filter is shown to have the best performance of all filters, including the industry-standard multiplicative extended Kalman filter (MEKF).
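
    As a generic example of the kind of discretized unit-quaternion propagation such implementations involve (not any particular filter from the paper, and with illustrative names), a first-order attitude update from a body-rate measurement looks like this:

```python
import numpy as np

def quat_mul(p, q):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

def propagate(q, omega, dt):
    """First-order integration of body rates omega, then renormalize."""
    dq = np.concatenate([[1.0], 0.5 * dt * np.asarray(omega)])
    q_new = quat_mul(q, dq)
    return q_new / np.linalg.norm(q_new)   # stay on the unit sphere
```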

    Parameter estimation by implicit sampling

    Implicit sampling is a weighted sampling method that is used in data assimilation, where one sequentially updates estimates of the state of a stochastic model based on a stream of noisy or incomplete data. Here we describe how to use implicit sampling in parameter estimation problems, where the goal is to find parameters of a numerical model, e.g. a partial differential equation (PDE), such that the output of the numerical model is compatible with (noisy) data. We use the Bayesian approach to parameter estimation, in which a posterior probability density describes the probability of the parameter conditioned on data, and we compute an empirical estimate of this posterior with implicit sampling. Our approach generates independent samples, so that some of the practical difficulties one encounters with Markov chain Monte Carlo methods, e.g. burn-in time or correlations among dependent samples, are avoided. We describe a new implementation of implicit sampling for parameter estimation problems that makes use of multiple grids (coarse to fine) and BFGS optimization coupled to adjoint equations for the required gradient calculations. The implementation is "dimension independent", in the sense that a well-defined finite-dimensional subspace is sampled as the mesh used for discretization of the PDE is refined. We illustrate the algorithm with an example where we estimate a diffusion coefficient in an elliptic equation from sparse and noisy pressure measurements. In the example, dimension/mesh-independence is achieved via Karhunen-Loève expansions.
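
    A hedged sketch of the implicit-sampling construction with a linearized (Gaussian) map around the MAP point: minimize the negative log-posterior F once offline, then draw independent, weighted samples. Here F and its BFGS-estimated Hessian are illustrative stand-ins for the paper's PDE-constrained, adjoint-based setup.

```python
import numpy as np
from scipy.optimize import minimize

def implicit_sampling(F, x0, n_samples=1000, seed=0):
    res = minimize(F, x0, method="BFGS")   # MAP point via optimization
    mu, phi = res.x, res.fun               # minimizer and minimum of F
    L = np.linalg.cholesky(res.hess_inv)   # Gaussian map from BFGS Hessian
    rng = np.random.default_rng(seed)
    xs, logw = [], []
    for _ in range(n_samples):
        xi = rng.standard_normal(mu.size)  # reference Gaussian sample
        x = mu + L @ xi                    # push through the linear map
        # Importance weight corrects the Gaussian proposal to exp(-F).
        logw.append(-(F(x) - phi) + 0.5 * xi @ xi)
        xs.append(x)
    logw = np.asarray(logw)
    w = np.exp(logw - logw.max())
    return np.asarray(xs), w / w.sum()
```

    Because each sample is drawn independently from the same map, there is no burn-in and no autocorrelation to monitor, which is the practical advantage over MCMC highlighted above.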

    Finite Density Algorithm in Lattice QCD -- a Canonical Ensemble Approach

    I will review the finite density algorithm for lattice QCD based on finite chemical potential and summarize the associated difficulties. I will propose a canonical ensemble approach which projects out the finite baryon number sector from the fermion determinant. For this algorithm to work, it requires an efficient method for calculating the fermion determinant and a Monte Carlo algorithm which accommodates an unbiased estimate of the probability. I shall report on the progress made along this direction with the Padé-Z_2 estimator of the determinant and its implementation in the newly developed Noisy Monte Carlo algorithm. Comment: Invited talk at the Nankai Symposium on Mathematical Physics, Tianjin, Oct. 2001, 18 pages, 3 figures; expanded and references added
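
    A toy sketch of the canonical projection idea: the fixed quark-number sector is obtained from the fermion determinant by a Fourier transform in an imaginary chemical potential. Here det_M is an illustrative stand-in for an actual lattice determinant evaluation (in practice supplied by an estimator such as the Padé-Z_2 one mentioned above).

```python
import numpy as np

def canonical_sector(det_M, n_quark, n_phi=64):
    """Z_n = (1/2pi) * int_0^{2pi} dphi exp(-i*n*phi) det M(exp(i*phi)),
    approximated by an n_phi-point discrete Fourier sum."""
    phis = 2.0 * np.pi * np.arange(n_phi) / n_phi
    vals = np.array([det_M(np.exp(1j * p)) for p in phis])
    return np.mean(vals * np.exp(-1j * n_quark * phis))
```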