Robust Bayesian inference via coarsening
The standard approach to Bayesian inference is based on the assumption that
the distribution of the data belongs to the chosen model class. However, even a
small violation of this assumption can have a large impact on the outcome of a
Bayesian procedure. We introduce a simple, coherent approach to Bayesian
inference that improves robustness to perturbations from the model: rather than
condition on the data exactly, one conditions on a neighborhood of the
empirical distribution. When using neighborhoods based on relative entropy
estimates, the resulting "coarsened" posterior can be approximated by simply
tempering the likelihood---that is, by raising it to a fractional power---thus,
inference is often easily implemented with standard methods, and one can even
obtain analytical solutions when using conjugate priors. Some theoretical
properties are derived, and we illustrate the approach with real and simulated
data, using mixture models, autoregressive models of unknown order, and
variable selection in linear regression.
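As a concrete illustration of the tempering idea, here is a minimal sketch for a conjugate Beta-Bernoulli model, assuming the fractional power is taken as zeta = alpha/(alpha + n) with a user-chosen robustness parameter alpha; the function name, the default values, and the simulated data are illustrative assumptions, not the paper's prescription for every model.
```python
import numpy as np

def coarsened_beta_bernoulli(x, a=1.0, b=1.0, alpha=50.0):
    """Tempered (coarsened) posterior for a Bernoulli success probability.

    The likelihood is raised to the fractional power zeta = alpha/(alpha + n),
    so with a Beta(a, b) prior the coarsened posterior is again a Beta
    distribution with 'discounted' counts.  alpha controls how much model
    misspecification is tolerated (alpha -> infinity recovers the standard
    posterior).
    """
    x = np.asarray(x)
    n = x.size
    zeta = alpha / (alpha + n)            # tempering exponent
    a_post = a + zeta * x.sum()           # discounted success count
    b_post = b + zeta * (n - x.sum())     # discounted failure count
    return a_post, b_post

# Illustrative use with simulated Bernoulli data
rng = np.random.default_rng(0)
x = rng.binomial(1, 0.3, size=1000)
print(coarsened_beta_bernoulli(x))        # the standard posterior would use zeta = 1
```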
Predictability, complexity and learning
We define {\em predictive information} $I_{\rm pred}(T)$ as the mutual
information between the past and the future of a time series. Three
qualitatively different behaviors are found in the limit of large observation
times $T$: $I_{\rm pred}(T)$ can remain finite, grow logarithmically, or grow
as a fractional power law. If the time series allows us to learn a model with a
finite number of parameters, then $I_{\rm pred}(T)$ grows logarithmically with
a coefficient that counts the dimensionality of the model space. In contrast,
power--law growth is associated, for example, with the learning of infinite
parameter (or nonparametric) models such as continuous functions with
smoothness constraints. There are connections between the predictive
information and measures of complexity that have been defined both in learning
theory and in the analysis of physical systems through statistical mechanics
and dynamical systems theory. Further, in the same way that entropy provides
the unique measure of available information consistent with some simple and
plausible conditions, we argue that the divergent part of $I_{\rm pred}(T)$
provides the unique measure for the complexity of dynamics underlying a time
series. Finally, we discuss how these ideas may be useful in different problems
in physics, statistics, and biology.
Comment: 53 pages, 3 figures, 98 references, LaTeX2e
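A rough sketch of the quantity being discussed, assuming a binary first-order Markov chain and a plug-in estimate of the mutual information between past and future windows of length w; the chain's flip probability, the window lengths, and the function names are illustrative choices, and plug-in estimation is only one of several possible estimators.
```python
import numpy as np
from collections import Counter

def plug_in_mi(pairs):
    """Plug-in mutual information (in nats) between the two halves of each pair."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(p for p, _ in pairs)
    py = Counter(f for _, f in pairs)
    return sum(c / n * np.log((c / n) / ((px[p] / n) * (py[f] / n)))
               for (p, f), c in joint.items())

def predictive_information(series, w):
    """Estimate I(past; future) using past/future windows of length w."""
    pairs = [(tuple(series[i - w:i]), tuple(series[i:i + w]))
             for i in range(w, len(series) - w + 1)]
    return plug_in_mi(pairs)

# Illustrative binary Markov chain: flip the state with probability 0.1
rng = np.random.default_rng(1)
state, series = 0, []
for _ in range(200_000):
    if rng.random() < 0.1:
        state = 1 - state
    series.append(state)

# For a finite-order Markov chain the estimate saturates as w grows
for w in (1, 2, 4):
    print(w, predictive_information(series, w))
```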
Volatility forecasting
Volatility has been one of the most active and successful areas of research in time series econometrics and economic forecasting in recent decades. This chapter provides a selective survey of the most important theoretical developments and empirical insights to emerge from this burgeoning literature, with a distinct focus on forecasting applications. Volatility is inherently latent, and Section 1 begins with a brief intuitive account of various key volatility concepts. Section 2 then discusses a series of different economic situations in which volatility plays a crucial role, ranging from the use of volatility forecasts in portfolio allocation to density forecasting in risk management. Sections 3, 4 and 5 present a variety of alternative procedures for univariate volatility modeling and forecasting based on the GARCH, stochastic volatility and realized volatility paradigms, respectively. Section 6 extends the discussion to the multivariate problem of forecasting conditional covariances and correlations, and Section 7 discusses volatility forecast evaluation methods in both univariate and multivariate cases. Section 8 concludes briefly. JEL Classification: C10, C53, G1
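To make the GARCH branch of this survey concrete, the following sketch filters a GARCH(1,1) conditional variance through a return series and iterates the h-step-ahead forecast recursion; the hand-picked parameters (omega, alpha, beta) and the simulated returns are purely illustrative, and in practice the parameters would be estimated, e.g. by quasi-maximum likelihood.
```python
import numpy as np

def garch11_forecast(returns, omega, alpha, beta, horizon=10):
    """Conditional variance forecasts from a GARCH(1,1) model.

    Filters sigma_t^2 = omega + alpha * r_{t-1}^2 + beta * sigma_{t-1}^2 through
    the sample, then iterates the multi-step forecast
    E[sigma_{t+h}^2] = omega + (alpha + beta) * E[sigma_{t+h-1}^2].
    """
    r = np.asarray(returns)
    sigma2 = np.empty(r.size)
    sigma2[0] = r.var()                    # initialize at the sample variance
    for t in range(1, r.size):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    fcast = np.empty(horizon)
    fcast[0] = omega + alpha * r[-1] ** 2 + beta * sigma2[-1]
    for h in range(1, horizon):
        fcast[h] = omega + (alpha + beta) * fcast[h - 1]
    return fcast

# Illustrative use with simulated returns and hand-picked parameters
rng = np.random.default_rng(2)
r = 0.01 * rng.standard_normal(500)
print(garch11_forecast(r, omega=1e-6, alpha=0.05, beta=0.90))
```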
Forecasting using relative entropy
The paper describes a relative entropy procedure for imposing moment restrictions on simulated forecast distributions from a variety of models. Starting from an empirical forecast distribution for some variables of interest, the technique generates a new empirical distribution that satisfies a set of moment restrictions. The new distribution is chosen to be as close as possible to the original in the sense of minimizing the associated Kullback-Leibler Information Criterion, or relative entropy. The authors illustrate the technique by using several examples that show how restrictions from other forecasts and from economic theory may be introduced into a model's forecasts.
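A minimal sketch of the exponential-tilting form such relative-entropy solutions take, assuming uniform starting weights on the simulated draws and a single mean restriction; the target value, the root-finding bracket, and the function name are illustrative assumptions rather than the authors' implementation.
```python
import numpy as np
from scipy.optimize import brentq

def relative_entropy_tilt(draws, target_mean):
    """Reweight forecast draws so their mean equals target_mean while staying as
    close as possible to the original (uniform) weights in relative entropy.

    The minimizer has the exponential-tilting form w_i proportional to
    exp(gamma * x_i), with gamma chosen so the weighted mean hits the restriction.
    """
    x = np.asarray(draws, dtype=float)

    def mean_gap(gamma):
        w = np.exp(gamma * (x - x.mean()))   # centering improves numerical stability
        w /= w.sum()
        return w @ x - target_mean

    gamma = brentq(mean_gap, -5.0, 5.0)      # assumes the root lies in this bracket
    w = np.exp(gamma * (x - x.mean()))
    return w / w.sum()

# Example: shift the mean of simulated forecast draws from about 2.0 to 2.5
rng = np.random.default_rng(3)
draws = rng.normal(2.0, 1.0, size=5000)
w = relative_entropy_tilt(draws, target_mean=2.5)
print(w @ draws)                             # ~2.5 by construction
```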
Optimal cross-validation in density estimation with the $L^2$-loss
We analyze the performance of cross-validation (CV) in the density estimation
framework with two purposes: (i) risk estimation and (ii) model selection. The
main focus is given to the so-called leave-$p$-out CV procedure (Lpo), where
$p$ denotes the cardinality of the test set. Closed-form expressions are
settled for the Lpo estimator of the risk of projection estimators. These
expressions provide a great improvement upon $V$-fold cross-validation in terms
of variability and computational complexity. From a theoretical point of view,
closed-form expressions also make it possible to study the Lpo performance in terms of
risk estimation. The optimality of leave-one-out (Loo), that is Lpo with $p=1$,
is proved among CV procedures used for risk estimation. Two model selection
frameworks are also considered: estimation, as opposed to identification. For
estimation with finite sample size $n$, optimality is achieved for $p$ large
enough [with $p/n = o(1)$] to balance the overfitting resulting from the
structure of the model collection. For identification, model selection
consistency is settled for Lpo as long as $p/n$ is conveniently related to the
rate of convergence of the best estimator in the collection: (i) $p/n \to 1$ as
$n \to +\infty$ with a parametric rate, and (ii) with some
nonparametric estimators. These theoretical results are validated by simulation
experiments.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1240 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
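For intuition about closed-form CV expressions for projection estimators, here is a sketch using the textbook leave-one-out least-squares CV formula for a histogram on [0, 1]; the histogram example, the bin-count grid, and the simulated data are illustrative and far simpler than the general expressions derived in the paper.
```python
import numpy as np

def histogram_loo_risk(x, n_bins, lo=0.0, hi=1.0):
    """Closed-form leave-one-out L2 risk estimate for a histogram estimator.

    For bin width h and bin proportions p_j = N_j / n, the classical formula is
        CV(h) = 2 / ((n - 1) h) - (n + 1) / ((n - 1) h) * sum_j p_j ** 2,
    so no explicit refitting over the n leave-one-out splits is required.
    """
    x = np.asarray(x)
    n = x.size
    h = (hi - lo) / n_bins
    counts, _ = np.histogram(x, bins=n_bins, range=(lo, hi))
    p = counts / n
    return 2.0 / ((n - 1) * h) - (n + 1) / ((n - 1) * h) * np.sum(p ** 2)

# Pick the number of bins minimizing the estimated risk on illustrative data
rng = np.random.default_rng(4)
x = rng.beta(2, 5, size=2000)
risks = {k: histogram_loo_risk(x, k) for k in range(2, 60)}
print(min(risks, key=risks.get))
```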