    Almost the Best of Three Worlds: Risk, Consistency and Optional Stopping for the Switch Criterion in Nested Model Selection

    We study the switch distribution, introduced by Van Erven et al. (2012), applied to model selection and subsequent estimation. While switching was known to be strongly consistent, here we show that it achieves minimax optimal parametric risk rates up to a $\log\log n$ factor when comparing two nested exponential families, partially confirming a conjecture by Lauritzen (2012) and Cavanaugh (2012) that switching behaves asymptotically like the Hannan-Quinn criterion. Moreover, like Bayes factor model selection but unlike standard significance testing, when one of the models represents a simple hypothesis, the switch criterion defines a robust null hypothesis test, meaning that its Type-I error probability can be bounded irrespective of the stopping rule. Hence, switching is consistent, insensitive to optional stopping and almost minimax risk optimal, showing that, Yang's (2005) impossibility result notwithstanding, it is possible to `almost' combine the strengths of AIC and Bayes factor model selection.

    Comment: To appear in Statistica Sinica.

    Asymptotic Properties for Methods Combining Minimum Hellinger Distance Estimates and Bayesian Nonparametric Density Estimates

    In frequentist inference, minimizing the Hellinger distance between a kernel density estimate and a parametric family produces estimators that are both robust to outliers and statistically efficient when the parametric model is correct. This paper seeks to extend these results to the use of nonparametric Bayesian density estimators within disparity methods. We propose two estimators: one replaces the kernel density estimator with the expected posterior density from a random histogram prior; the other induces a posterior over parameters through the posterior for the random histogram. We show that it is possible to adapt the mathematical machinery of efficient influence functions from semiparametric models to demonstrate that both our estimators are efficient in the sense of achieving the Cramer-Rao lower bound. We further demonstrate a Bernstein-von Mises result for our second estimator, indicating that its posterior is asymptotically Gaussian. In addition, the robustness properties of classical minimum Hellinger distance estimators continue to hold.
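The classical frequentist starting point of this abstract can be sketched numerically: fit a parametric family by minimizing the Hellinger distance to a kernel density estimate. The sketch below assumes a Gaussian family $N(\mu, \sigma^2)$ and uses SciPy's KDE; the function name `mhd_estimate` and the grid choices are illustrative, not from the paper.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm
from scipy.optimize import minimize

def mhd_estimate(x):
    """Minimum Hellinger distance fit of a N(mu, sigma^2) family
    to a Gaussian kernel density estimate of the sample x."""
    grid = np.linspace(x.min() - 3.0, x.max() + 3.0, 2000)
    dx = grid[1] - grid[0]
    kde = gaussian_kde(x)(grid)  # nonparametric density estimate

    def hellinger_sq(params):
        mu, log_sigma = params  # log-parameterization keeps sigma > 0
        f = norm.pdf(grid, mu, np.exp(log_sigma))
        # squared Hellinger distance, integrated on the grid
        return 0.5 * np.sum((np.sqrt(kde) - np.sqrt(f)) ** 2) * dx

    res = minimize(hellinger_sq, x0=[np.median(x), np.log(x.std())])
    return res.x[0], np.exp(res.x[1])

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 500), [50.0] * 5])  # 1% gross outliers
mu_hat, sigma_hat = mhd_estimate(x)
# the fit tracks the central bulk of the data rather than the outliers
```

The robustness claimed for Hellinger-based estimators shows up here: the handful of outliers at 50 barely moves the fitted location, whereas the raw sample mean is pulled toward them.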

    Lower Bounds on Exponential Moments of the Quadratic Error in Parameter Estimation

    Considering the problem of risk-sensitive parameter estimation, we propose a fairly wide family of lower bounds on the exponential moments of the quadratic error, both in the Bayesian and the non-Bayesian regime. This family of bounds, which is based on a change of measures, offers considerable freedom in the choice of the reference measure, and our efforts are devoted to exploring this freedom to a certain extent. Our focus is mostly on signal models that are relevant to communication problems, namely, models of a parameter-dependent signal (modulated signal) corrupted by additive white Gaussian noise, but the methodology proposed is also applicable to other types of parametric families, such as models of linear systems driven by random input signals (white noise, in most cases), and others. In addition to the well known motivations of the risk-sensitive cost function (i.e., the exponential quadratic cost function), most notably its robustness to model uncertainty, we also view this cost function as a tool for studying fundamental limits concerning the tail behavior of the estimation error. Another interesting aspect, which we demonstrate in a certain parametric model, is that the risk-sensitive cost function may be subject to phase transitions, owing to some analogies with statistical mechanics.

    Comment: 28 pages; 4 figures; submitted for publication.
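The exponential quadratic cost discussed above can be made concrete in the simplest Gaussian location model, which is not the paper's modulated-signal setting but illustrates the same blow-up phenomenon. For the sample mean of $n$ observations $y_i = \theta + w_i$, $w_i \sim N(0,1)$, the error is $N(0, 1/n)$, so $E[\exp(\rho(\hat\theta-\theta)^2)] = (1 - 2\rho/n)^{-1/2}$, finite only while $\rho < n/2$; beyond that threshold the exponential moment diverges, a toy instance of the abrupt transitions the abstract alludes to.

```python
import numpy as np

# Monte Carlo check of the closed-form exponential moment of the
# quadratic error for the sample mean in a Gaussian location model.
rng = np.random.default_rng(1)
n, rho, theta = 20, 4.0, 1.3          # rho < n/2, so the moment is finite
errs = rng.normal(theta, 1.0, size=(200_000, n)).mean(axis=1) - theta
mc = np.exp(rho * errs ** 2).mean()   # empirical exponential moment
exact = 1.0 / np.sqrt(1.0 - 2.0 * rho / n)
```

Pushing `rho` toward `n/2` makes `exact` diverge and the Monte Carlo estimate increasingly unstable, which is the tail-sensitivity that motivates studying this cost.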

    Prediction of time series by statistical learning: general losses and fast rates

    We establish rates of convergence in time series forecasting using the statistical learning approach based on oracle inequalities. A series of papers extends the oracle inequalities obtained for iid observations to time series under weak dependence conditions. Given a family of predictors and $n$ observations, oracle inequalities state that a predictor forecasts the series as well as the best predictor in the family up to a remainder term $\Delta_n$. Using the PAC-Bayesian approach, we establish under weak dependence conditions oracle inequalities with optimal rates of convergence. We extend previous results for the absolute loss function to any Lipschitz loss function with rates $\Delta_n \sim \sqrt{c(\Theta)/n}$, where $c(\Theta)$ measures the complexity of the model. We apply the method for quantile loss functions to forecast the French GDP. Under additional conditions on the loss functions (satisfied by the quadratic loss function) and on the time series, we refine the rates of convergence to $\Delta_n \sim c(\Theta)/n$. We achieve for the first time these fast rates for uniformly mixing processes. These rates are known to be optimal in the iid case and for individual sequences. In particular, we generalize the results of Dalalyan and Tsybakov on sparse regression estimation to the case of autoregression.

    Bayesian Bootstrap Analysis of Systems of Equations

    Research Methods/Statistical Methods

    Good, great, or lucky? Screening for firms with sustained superior performance using heavy-tailed priors

    This paper examines historical patterns of ROA (return on assets) for a cohort of 53,038 publicly traded firms across 93 countries, measured over the past 45 years. Our goal is to screen for firms whose ROA trajectories suggest that they have systematically outperformed their peer groups over time. Such a project faces at least three statistical difficulties: adjustment for relevant covariates, massive multiplicity, and longitudinal dependence. We conclude that, once these difficulties are taken into account, demonstrably superior performance appears to be quite rare. We compare our findings with other recent management studies on the same subject, and with the popular literature on corporate success. Our methodological contribution is to propose a new class of priors for use in large-scale simultaneous testing. These priors are based on the hypergeometric inverted-beta family, and have two main attractive features: heavy tails and computational tractability. The family is a four-parameter generalization of the normal/inverted-beta prior, and is the natural conjugate prior for shrinkage coefficients in a hierarchical normal model. Our results emphasize the usefulness of these heavy-tailed priors in large multiple-testing problems, as they have a mild rate of tail decay in the marginal likelihood $m(y)$---a property long recognized to be important in testing.

    Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/11-AOAS512 by the Institute of Mathematical Statistics (http://www.imstat.org).