Barrier Frank-Wolfe for Marginal Inference
We introduce a globally-convergent algorithm for optimizing the
tree-reweighted (TRW) variational objective over the marginal polytope. The
algorithm is based on the conditional gradient method (Frank-Wolfe) and moves
pseudomarginals within the marginal polytope through repeated maximum a
posteriori (MAP) calls. This modular structure enables us to leverage black-box
MAP solvers (both exact and approximate) for variational inference, and obtains
more accurate results than tree-reweighted algorithms that optimize over the
local consistency relaxation. Theoretically, we bound the sub-optimality for
the proposed algorithm despite the TRW objective having unbounded gradients at
the boundary of the marginal polytope. Empirically, we demonstrate the
increased quality of results found by tightening the relaxation over the
marginal polytope as well as the spanning tree polytope on synthetic and
real-world instances.
Comment: 25 pages, 12 figures. To appear in Neural Information Processing
Systems (NIPS) 2015. Corrected reference and cleaned up bibliography.
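As a rough illustration of the conditional gradient (Frank-Wolfe) idea mentioned in this abstract, the sketch below minimizes a simple quadratic over the probability simplex. It is not the paper's algorithm: the quadratic objective, the simplex domain, the 2/(t+2) step size, and the names `frank_wolfe_simplex`, `grad`, and `target` are assumptions chosen for brevity. The vertex-picking step stands in for the MAP call, which plays the role of the linear minimization oracle over the marginal polytope.

```python
def frank_wolfe_simplex(grad, x0, num_iters=200):
    """Generic Frank-Wolfe over the probability simplex (illustrative only)."""
    x = list(x0)
    for t in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, a linear function is
        # minimized at the vertex with the smallest gradient coordinate.
        # In the abstract's setting this role is played by a MAP solver.
        i_star = min(range(len(x)), key=lambda i: g[i])
        # Standard diminishing step size (an assumption, not the paper's rule).
        gamma = 2.0 / (t + 2.0)
        # Convex combination keeps the iterate inside the feasible set.
        x = [(1.0 - gamma) * xi for xi in x]
        x[i_star] += gamma
    return x


# Hypothetical objective: f(x) = 0.5 * ||x - target||^2, with target in the simplex.
target = [0.2, 0.5, 0.3]


def grad(x):
    return [xi - ti for xi, ti in zip(x, target)]


x_hat = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0], num_iters=2000)
```

Every iterate is a convex combination of vertices, so feasibility never requires a projection step; only the oracle needs to know the shape of the feasible set.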
A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning
We present a tutorial on Bayesian optimization, a method of finding the
maximum of expensive cost functions. Bayesian optimization employs the Bayesian
technique of setting a prior over the objective function and combining it with
evidence to get a posterior function. This permits a utility-based selection of
the next observation to make on the objective function, which must take into
account both exploration (sampling from areas of high uncertainty) and
exploitation (sampling areas likely to offer improvement over the current best
observation). We also present two detailed extensions of Bayesian optimization,
with experiments---active user modelling with preferences, and hierarchical
reinforcement learning---and a discussion of the pros and cons of Bayesian
optimization based on our experiences
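The exploration/exploitation trade-off described in this abstract can be made concrete with an acquisition function. The sketch below uses expected improvement for a maximization problem, which is one common choice and not necessarily the one the abstract has in mind. The posterior means and standard deviations are made-up numbers standing in for the output of a fitted model, and the names `expected_improvement`, `candidates`, and `best_so_far` are hypothetical.

```python
import math


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` for a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No uncertainty: improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # First term rewards a high mean (exploitation),
    # second term rewards high uncertainty (exploration).
    return (mu - best) * cdf + sigma * pdf


# Hypothetical posterior summaries (mean, standard deviation) at three candidates.
candidates = {
    "exploit": (1.05, 0.05),  # slightly better mean, very certain
    "explore": (0.60, 0.80),  # worse mean, very uncertain
    "poor": (0.20, 0.05),     # bad mean, very certain
}
best_so_far = 1.0

scores = {
    name: expected_improvement(m, s, best_so_far)
    for name, (m, s) in candidates.items()
}
next_point = max(scores, key=scores.get)
```

With these illustrative numbers the uncertain candidate scores highest, which shows how the utility can favor exploration even when another point has a better posterior mean.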
The OS* Algorithm: a Joint Approach to Exact Optimization and Sampling
Most current sampling algorithms for high-dimensional distributions are based
on MCMC techniques and are approximate in the sense that they are valid only
asymptotically. Rejection sampling, on the other hand, produces valid samples,
but is unrealistically slow in high-dimensional spaces. The OS* algorithm that we
propose is a unified approach to exact optimization and sampling, based on
incremental refinements of a functional upper bound, which combines ideas of
adaptive rejection sampling and of A* optimization search. We show that the
choice of the refinement can be done in a way that ensures tractability in
high-dimensional spaces, and we present first experiments in two different
settings: inference in high-order HMMs and in large discrete graphical models.
Comment: 21 pages
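For readers unfamiliar with the rejection-sampling building block mentioned in this abstract, here is a minimal sketch on a small discrete space. It shows only plain rejection sampling under a fixed upper bound; the incremental refinement of the bound that the abstract describes is not implemented. The weights `p` and `q` and the function name `rejection_sample` are assumptions made for illustration.

```python
import random


def rejection_sample(p, q, n, rng):
    """Draw n exact samples from p / sum(p), given q[x] >= p[x] for every x."""
    support = list(range(len(p)))
    samples = []
    while len(samples) < n:
        # Propose from the normalized upper bound q.
        x = rng.choices(support, weights=q, k=1)[0]
        # Accept with probability p[x] / q[x]; accepted draws follow p exactly.
        if rng.random() * q[x] < p[x]:
            samples.append(x)
    return samples


# Hypothetical unnormalized target and a loose constant upper bound.
p = [1.0, 4.0, 2.0, 1.0]
q = [4.0, 4.0, 4.0, 4.0]

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = rejection_sample(p, q, 20000, rng)
freq = [draws.count(x) / len(draws) for x in range(len(p))]
```

The acceptance rate equals sum(p) / sum(q), here one half. A tighter bound wastes fewer proposals, which is why refining the bound matters.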
A Novel Framework for Online Amnesic Trajectory Compression in Resource-constrained Environments
State-of-the-art trajectory compression methods usually involve high
space-time complexity or yield unsatisfactory compression rates, leading to
rapid exhaustion of memory, computation, storage and energy resources. Their
ability is commonly limited when operating in a resource-constrained
environment especially when the data volume (even when compressed) far exceeds
the storage limit. Hence we propose a novel online framework for error-bounded
trajectory compression and ageing called the Amnesic Bounded Quadrant System
(ABQS), whose core is the Bounded Quadrant System (BQS) algorithm family that
includes a normal version (BQS), a fast version (FBQS), and a progressive version
(PBQS). ABQS intelligently manages a given storage and compresses the
trajectories with different error tolerances subject to their ages. In the
experiments, we conduct comprehensive evaluations for the BQS algorithm family
and the ABQS framework. Using empirical GPS traces from flying foxes and cars,
and synthetic data from simulation, we demonstrate the effectiveness of the
standalone BQS algorithms in significantly reducing the time and space
complexity of trajectory compression, while greatly improving the compression
rates of the state-of-the-art algorithms (up to 45%). We also show that the
operational time of the target resource-constrained hardware platform can be
prolonged by up to 41%. We then verify that, given data volumes far greater
than the storage space, ABQS achieves errors 15 to 400 times smaller than the
baselines. We also show that the algorithm is robust to
extreme trajectory shapes.
Comment: arXiv admin note: substantial text overlap with arXiv:1412.032
Sparse Identification and Estimation of Large-Scale Vector AutoRegressive Moving Averages
The Vector AutoRegressive Moving Average (VARMA) model is fundamental to the
theory of multivariate time series; however, in practice, identifiability
issues have led many authors to abandon VARMA modeling in favor of the simpler
Vector AutoRegressive (VAR) model. Such a practice is unfortunate since even
very simple VARMA models can have quite complicated VAR representations. We
narrow this gap with a new optimization-based approach to VARMA identification
that is built upon the principle of parsimony. Among all equivalent
data-generating models, we seek the parameterization that is "simplest" in a
certain sense. A user-specified strongly convex penalty is used to measure
model simplicity, and that same penalty is then used to define an estimator
that can be efficiently computed. We show that our estimator converges to a
parsimonious element in the set of all equivalent data-generating models, in a
double asymptotic regime where the number of component time series is allowed
to grow with sample size. Further, we derive non-asymptotic upper bounds on the
estimation error of our method relative to our specially identified target.
Novel theoretical machinery includes non-asymptotic analysis of infinite-order
VAR, elastic net estimation under a singular covariance structure of
regressors, and new concentration inequalities for quadratic forms of random
variables from Gaussian time series. We illustrate the competitive performance
of our methods in simulation and several application domains, including
macro-economic forecasting, demand forecasting, and volatility forecasting
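The remark that very simple VARMA models can have complicated VAR representations can be checked on the smallest case. The derivation below is a standard textbook illustration, not taken from the paper; it assumes a first-order vector moving average with coefficient matrix $\Theta$ whose eigenvalues all lie strictly inside the unit circle (invertibility).

```latex
% VMA(1) model, with white noise \varepsilon_t:
y_t = \varepsilon_t + \Theta\,\varepsilon_{t-1}

% Solve for the innovation and substitute recursively:
\varepsilon_t = y_t - \Theta\,\varepsilon_{t-1}
              = y_t - \Theta\,y_{t-1} + \Theta^{2}\,y_{t-2} - \cdots
              = \sum_{j=0}^{\infty} (-\Theta)^{j}\, y_{t-j}

% Equivalent infinite-order VAR representation:
y_t = \sum_{j=1}^{\infty} (-1)^{j+1}\,\Theta^{j}\, y_{t-j} + \varepsilon_t
```

A single parameter matrix in the moving-average form therefore corresponds to infinitely many nonzero lag matrices in the autoregressive form, and any finite-order VAR fit can only approximate it.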
Forecasting and policy making
This chapter investigates the use of economic forecasting in policy making. Forecasts are used in many policy areas to project the consequences of particular policy measures for policymakers' targets. After reviewing some important forecasts of fiscal authorities and central banks, we proceed to focus on the role of forecasts in monetary policy. A formal framework serves to differentiate the role of forecasts in simple feedback rules versus optimal control policies. We then provide empirical evidence that central bank policies in the United States and the euro area are well described by interest rate rules responding to forecasts of inflation and economic activity rather than outcomes. Next, we provide a detailed exposition of methods for producing forecasts and the associated forecasting models. Practical applications with U.S. or euro area data are reported. Particular issues discussed include the use of economic structure in interpreting forecasts and the implementation of different conditioning assumptions regarding future policy that play a role in practice. We also compare the accuracy of model and expert forecasts and measure the degree of forecast heterogeneity. Finally, we utilize macroeconomic models to study the interaction of forecasting and policy by evaluating the performance and robustness of forecast versu
Action and behavior: a free-energy formulation
We have previously tried to explain perceptual inference and learning under a free-energy principle that pursues Helmholtz's agenda to understand the brain in terms of energy minimization. It is fairly easy to show that making inferences about the causes of sensory data can be cast as the minimization of a free-energy bound on the likelihood of sensory inputs, given an internal model of how they were caused. In this article, we consider what would happen if the data themselves were sampled to minimize this bound. It transpires that the ensuing active sampling or inference is mandated by ergodic arguments based on the very existence of adaptive agents. Furthermore, it accounts for many aspects of motor behavior, from retinal stabilization to goal-seeking. In particular, it suggests that motor control can be understood as fulfilling prior expectations about proprioceptive sensations. This formulation can explain why adaptive behavior emerges in biological agents and suggests a simple alternative to optimal control theory. We illustrate these points using simulations of oculomotor control and then apply the same principles to cued and goal-directed movements. In short, the free-energy formulation may provide an alternative perspective on motor control that places it in an intimate relationship with perception