Optimal low-rank approximations of Bayesian linear inverse problems
In the Bayesian approach to inverse problems, data are often informative,
relative to the prior, only on a low-dimensional subspace of the parameter
space. Significant computational savings can be achieved by using this subspace
to characterize and approximate the posterior distribution of the parameters.
We first investigate approximation of the posterior covariance matrix as a
low-rank update of the prior covariance matrix. We prove optimality of a
particular update, based on the leading eigendirections of the matrix pencil
defined by the Hessian of the negative log-likelihood and the prior precision,
for a broad class of loss functions. This class includes the F\"{o}rstner
metric for symmetric positive definite matrices, as well as the
Kullback-Leibler divergence and the Hellinger distance between the associated
distributions. We also propose two fast approximations of the posterior mean
and prove their optimality with respect to a weighted Bayes risk under
squared-error loss. These approximations are deployed in an offline-online
manner, where a more costly but data-independent offline calculation is
followed by fast online evaluations. As a result, these approximations are
particularly useful when repeated posterior mean evaluations are required for
multiple data sets. We demonstrate our theoretical results with several
numerical examples, including high-dimensional X-ray tomography and an inverse
heat conduction problem. In both of these examples, the intrinsic
low-dimensional structure of the inference problem can be exploited while
producing results that are essentially indistinguishable from solutions
computed in the full space.
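The optimal covariance update described in this abstract can be sketched in a small NumPy example. Everything below (dimensions, the identity prior, the random linear forward map) is an illustrative assumption, not the paper's actual experiments; only the structure of the update follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 8   # assumed parameter dimension and number of observations

# Illustrative linear-Gaussian problem: prior N(0, G_pr), forward map A, unit noise
G_pr = np.eye(n)
A = rng.standard_normal((m, n))     # data inform only an m-dimensional subspace
H = A.T @ A                         # Hessian of the negative log-likelihood

# Leading eigendirections of the pencil (H, G_pr^{-1}) via prior preconditioning
L = np.linalg.cholesky(G_pr)        # G_pr = L L^T
lam, W = np.linalg.eigh(L.T @ H @ L)
order = np.argsort(lam)[::-1][:m]   # keep the m leading eigenpairs
lam, W = lam[order], W[:, order]

# Low-rank negative update of the prior covariance
U = L @ W
G_pos_lowrank = G_pr - U @ np.diag(lam / (1.0 + lam)) @ U.T

# Exact posterior covariance for comparison
G_pos_exact = np.linalg.inv(H + np.linalg.inv(G_pr))
```

Because the Hessian has rank m here, the rank-m update reproduces the exact posterior covariance; truncating below m yields the approximations whose optimality (in the F\"{o}rstner metric and related losses) the paper establishes.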
Bayesian forecasting and scalable multivariate volatility analysis using simultaneous graphical dynamic models
The recently introduced class of simultaneous graphical dynamic linear models
(SGDLMs) enables scalable on-line Bayesian analysis and forecasting for
higher-dimensional time series. This paper advances the methodology of
SGDLMs, developing and embedding a novel, adaptive method of simultaneous
predictor selection in forward filtering for on-line learning and forecasting.
The advances include developments in Bayesian computation for scalability, and
a case study exploring the resulting potential for improved short-term
forecasting of large-scale volatility matrices. The case study concerns financial
forecasting and portfolio optimization with a 400-dimensional series of daily
stock prices. Analysis shows that the SGDLM forecasts volatilities and
co-volatilities well, making it ideally suited to contributing to quantitative
investment strategies to improve portfolio returns. We also identify
performance metrics linked to the sequential Bayesian filtering analysis that
turn out to define a leading indicator of increased financial market stresses,
comparable to but leading the standard St. Louis Fed Financial Stress Index
(STLFSI) measure. Parallel computation using GPU implementations substantially
advances the ability to fit and use these models.
Comment: 28 pages, 9 figures, 7 tables
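The forward-filtering backbone that SGDLMs build on can be illustrated with a toy univariate dynamic linear model. This local-level sketch, with assumed state and observation variances, is a stand-in for a single coordinate only; the actual SGDLMs couple many series through simultaneous predictors.

```python
import numpy as np

rng = np.random.default_rng(3)

# One-step forward-filter updates for a univariate local-level DLM:
#   state: theta_t = theta_{t-1} + w_t,  w_t ~ N(0, W)
#   obs:   y_t     = theta_t + v_t,      v_t ~ N(0, V)
def forward_filter(y, m0=0.0, C0=1.0, W=0.01, V=1.0):
    m, C = m0, C0
    means = []
    for yt in y:
        R = C + W                 # prior variance after state evolution
        Q = R + V                 # one-step forecast variance
        K = R / Q                 # adaptive gain
        m = m + K * (yt - m)      # posterior mean given y_t
        C = R - K * R             # posterior variance
        means.append(m)
    return np.array(means)

y = 2.0 + rng.standard_normal(100)   # noisy observations of a constant level
means = forward_filter(y)            # on-line estimates track the level 2.0
```

The gain K adapts as uncertainty evolves, which is the mechanism that on-line predictor selection in the paper exploits at much larger scale.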
On the Regularizing Property of Stochastic Gradient Descent
Stochastic gradient descent is one of the most successful approaches for
solving large-scale problems, especially in machine learning and statistics. At
each iteration, it employs an unbiased estimator of the full gradient computed
from a single randomly selected data point. Hence, it scales well with
problem size, is very attractive for truly massive datasets, and holds
significant potential for solving large-scale inverse problems. In the recent
machine learning literature, it was empirically observed that, when equipped
with early stopping, the method has a regularizing property. In this work, we rigorously
establish its regularizing property (under \textit{a priori} early stopping
rule), and also prove convergence rates under the canonical sourcewise
condition, for minimizing the quadratic functional for linear inverse problems.
This is achieved by combining tools from classical regularization theory and
stochastic analysis. Further, we analyze the preasymptotic weak and strong
convergence behavior of the algorithm. The theoretical findings shed insights
into the performance of the algorithm, and are complemented with illustrative
numerical experiments.
Comment: 22 pages, better presentation
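A minimal sketch of the iteration studied here, for the quadratic functional of a linear inverse problem, might look as follows. The problem sizes, step size, and stopping index are illustrative assumptions, not the paper's choices.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 200, 50
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true + 0.1 * rng.standard_normal(m)   # noisy data

x = np.zeros(n)
eta = 0.01     # assumed constant step size, small enough for stability
K = 2000       # a priori early-stopping index, acting as the regularizer

for _ in range(K):
    i = rng.integers(m)              # one randomly selected data point
    g = (A[i] @ x - b[i]) * A[i]     # unbiased gradient of (1/2m)||Ax - b||^2
    x -= eta * g
```

Stopping at K rather than iterating to convergence plays the role of the regularization parameter: running far beyond it would start fitting the noise in b.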
Optimization Methods for Inverse Problems
Optimization plays an important role in solving many inverse problems.
Indeed, the task of inversion often either involves or is fully cast as a
solution of an optimization problem. In this light, the sheer non-linear,
non-convex, and large-scale nature of many of these inversions gives rise to
some very challenging optimization problems. The inverse problem community has
long been developing various techniques for solving such optimization tasks.
However, other, seemingly disjoint communities, such as that of machine
learning, have developed, almost in parallel, interesting alternative methods
which might have stayed under the radar of the inverse problem community. In
this survey, we aim to change that. In doing so, we first discuss current
state-of-the-art optimization methods widely used in inverse problems. We then
survey recent related advances in addressing similar challenges in problems
faced by the machine learning community, and discuss their potential advantages
for solving inverse problems. By highlighting the similarities among the
optimization challenges faced by the inverse problem and the machine learning
communities, we hope that this survey can serve as a bridge in bringing
together these two communities and encourage cross-fertilization of ideas.
Comment: 13 pages
Nonlinear Attitude Filtering: A Comparison Study
This paper contains a concise comparison of a number of nonlinear attitude
filtering methods that have attracted attention in the robotics and aviation
literature. With the help of previously published surveys and comparison
studies, the vast literature on the subject is narrowed down to a small pool of
competitive attitude filters. Amongst these filters is a second-order optimal
minimum-energy filter recently proposed by the authors. Easily comparable
discretized unit quaternion implementations of the selected filters are
provided. We conduct a simulation study and compare the transient behaviour and
asymptotic convergence of these filters in two scenarios with different
initialization and measurement errors inspired by applications in unmanned
aerial robotics and space flight. The second-order optimal minimum-energy
filter is shown to have the best performance of all filters, including the
industry-standard multiplicative extended Kalman filter (MEKF).
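The multiplicative treatment of attitude error shared by these quaternion filters can be illustrated with a small helper. The conventions and function names below are our own assumptions for illustration, not any of the compared implementations.

```python
import numpy as np

def qmul(p, q):
    # Hamilton product of quaternions stored as [w, x, y, z]
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

def multiplicative_update(q, dtheta):
    # Fold a small attitude-error rotation vector dtheta into the estimate q,
    # in the MEKF style: correct by a small rotation, then renormalize.
    dq = np.concatenate(([1.0], 0.5 * np.asarray(dtheta)))
    q_new = qmul(q, dq)
    return q_new / np.linalg.norm(q_new)   # project back onto the unit sphere
```

Updating multiplicatively, rather than adding to the quaternion components, keeps the estimate on the unit sphere and makes the error state a genuine small rotation, which is what the comparison in the paper exercises under large initialization errors.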
Parameter estimation by implicit sampling
Implicit sampling is a weighted sampling method that is used in data
assimilation, where one sequentially updates estimates of the state of a
stochastic model based on a stream of noisy or incomplete data. Here we
describe how to use implicit sampling in parameter estimation problems, where
the goal is to find parameters of a numerical model, e.g.~a partial
differential equation (PDE), such that the output of the numerical model is
compatible with (noisy) data. We use the Bayesian approach to parameter
estimation, in which a posterior probability density describes the probability
of the parameter conditioned on the data, and compute an empirical estimate of this
posterior with implicit sampling. Our approach generates independent samples,
so that some of the practical difficulties one encounters with Markov Chain
Monte Carlo methods, e.g.~burn-in time or correlations among dependent samples,
are avoided. We describe a new implementation of implicit sampling for
parameter estimation problems that makes use of multiple grids (coarse to fine)
and BFGS optimization coupled to adjoint equations for the required gradient
calculations. The implementation is "dimension independent", in the sense that
a well-defined finite dimensional subspace is sampled as the mesh used for
discretization of the PDE is refined. We illustrate the algorithm with an
example where we estimate a diffusion coefficient in an elliptic equation from
sparse and noisy pressure measurements. In the example,
dimension/mesh-independence is achieved via Karhunen-Lo\`{e}ve expansions.
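For a linear-Gaussian caricature of this setup, the core of implicit sampling, mapping reference Gaussian samples through the MAP point into independent weighted samples, can be sketched as follows. The toy forward map and all sizes are assumptions; the paper's PDE setting, multigrid, and adjoint machinery are far richer.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
G = np.eye(n)                  # toy linear forward map (stand-in for a PDE solve)
data = G @ rng.standard_normal(n) + 0.1 * rng.standard_normal(n)
s2 = 0.01                      # assumed noise variance

def F(x):
    # Negative log-posterior up to a constant: N(0, I) prior plus data misfit
    return 0.5 * x @ x + 0.5 * np.sum((G @ x - data) ** 2) / s2

# MAP point and Hessian; F is quadratic here, so both are exact
H = np.eye(n) + G.T @ G / s2
mu = np.linalg.solve(H, G.T @ data / s2)
L = np.linalg.cholesky(np.linalg.inv(H))

def implicit_sample():
    xi = rng.standard_normal(n)          # reference Gaussian sample
    x = mu + L @ xi                      # linear map: solves F(x) = F(mu) + xi.xi/2
    logw = F(mu) + 0.5 * xi @ xi - F(x)  # log-weight; identically 0 for quadratic F
    return x, logw
```

Because each xi is drawn independently, the resulting samples are independent, which is what sidesteps the burn-in and autocorrelation issues of Markov Chain Monte Carlo noted above; for non-quadratic F the log-weights become nontrivial.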
Finite Density Algorithm in Lattice QCD -- a Canonical Ensemble Approach
I will review the finite density algorithm for lattice QCD based on finite
chemical potential and summarize the associated difficulties. I will propose a
canonical ensemble approach which projects out the finite baryon number sector
from the fermion determinant. For this algorithm to work, it requires an
efficient method for calculating the fermion determinant and a Monte Carlo
algorithm that accommodates an unbiased estimate of the probability. I shall
report on the progress made along this direction with the Pad\'{e} - Z
estimator of the determinant and its implementation in the newly developed
Noisy Monte Carlo algorithm.
Comment: Invited talk at the Nankai Symposium on Mathematical Physics, Tianjin,
Oct. 2001; 18 pages, 3 figures; expanded and references added