
    Forecasting under model uncertainty: Non-homogeneous hidden Markov models with Pólya-Gamma data augmentation

    Full text link
    We consider two-state Non-Homogeneous Hidden Markov Models (NHHMMs) for forecasting univariate time series. Given a set of predictors, the time series are modeled via predictive regressions with state-dependent coefficients and time-varying transition probabilities that depend on the predictors via a logistic function. In a hidden Markov setting, inference for logistic regression coefficients becomes complicated, and in some cases impossible, due to convergence issues. In this paper, we aim to address this problem using a new latent variable scheme that utilizes the Pólya-Gamma class of distributions. We allow for model uncertainty regarding the predictors that affect the series both linearly -- in the mean -- and non-linearly -- in the transition matrix. Predictor selection and inference on the model parameters are based on an MCMC scheme with reversible jump steps. Single-step and multiple-step-ahead predictions are obtained from the most probable model, the median probability model or a Bayesian model averaging approach. Using simulation experiments, we illustrate the performance of our algorithm in various setups, in terms of mixing properties, model selection and predictive ability. An empirical study on realized volatility data shows that our methodology gives improved forecasts compared to benchmark models. Comment: 36 pages, 5 figures, 6 tables.
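
    The key computational ingredient above is the Pólya-Gamma augmentation of Polson, Scott and Windle, which turns the logistic-regression update into a conditionally Gaussian one. Below is a minimal Python sketch of a single such Gibbs update for the coefficients, assuming a zero-mean Gaussian prior with precision matrix prior_prec and a placeholder sample_pg(b, c) for whatever Pólya-Gamma sampler is available; it illustrates the general augmentation idea, not the paper's full NHHMM sampler.

```python
import numpy as np

def pg_gibbs_update(X, y, beta, prior_prec, sample_pg, rng):
    """One Polya-Gamma augmented Gibbs update for logistic regression
    coefficients. sample_pg(b, c) is assumed to draw from PG(b, c); a
    zero-mean Gaussian prior with precision matrix prior_prec is assumed.
    """
    psi = X @ beta                                        # linear predictor
    omega = np.array([sample_pg(1.0, c) for c in psi])    # PG(1, psi_i) draws
    kappa = y - 0.5                                       # y_i in {0, 1}
    V = np.linalg.inv(X.T @ (omega[:, None] * X) + prior_prec)
    m = V @ (X.T @ kappa)                                 # conditional mean
    return rng.multivariate_normal(m, V)                  # Gaussian draw
```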

    An optimal first order method based on optimal quadratic averaging

    Full text link
    In a recent paper, Bubeck, Lee, and Singh introduced a new first-order method for minimizing smooth strongly convex functions. Their geometric descent algorithm, largely inspired by the ellipsoid method, enjoys the optimal linear rate of convergence. We show that the same iterate sequence is generated by a scheme that in each iteration computes an optimal average of quadratic lower-models of the function. Indeed, the minimum of the averaged quadratic approaches the true minimum at an optimal rate. This intuitive viewpoint reveals clear connections to the original fast-gradient methods and cutting plane ideas, and leads to limited-memory extensions with improved performance. Comment: 23 pages.
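
    To make the quadratic-averaging viewpoint concrete: for a mu-strongly convex f, each gradient evaluation yields a lower model of the form c + (mu/2)||y - z||^2, and a convex combination of two such models is again of that form, with its constant term serving as a certified lower bound on the minimum. The sketch below, with hypothetical names, only illustrates this bookkeeping; it is not the authors' full algorithm with its optimal choice of the averaging weight.

```python
import numpy as np

class QuadLowerBound:
    """Lower model q(y) = c + (mu/2) * ||y - z||^2 for a mu-strongly convex f."""

    def __init__(self, z, c, mu):
        self.z, self.c, self.mu = z, c, mu

    @staticmethod
    def from_gradient(x, fx, gx, mu):
        # Strong convexity: f(y) >= fx + gx.(y - x) + (mu/2)||y - x||^2,
        # rewritten in centered form with center z and constant c.
        return QuadLowerBound(x - gx / mu, fx - gx @ gx / (2.0 * mu), mu)

    def average(self, other, lam):
        # lam*self + (1 - lam)*other is again of the same centered form.
        z = lam * self.z + (1.0 - lam) * other.z
        diff = self.z - other.z
        c = (lam * self.c + (1.0 - lam) * other.c
             + 0.5 * self.mu * lam * (1.0 - lam) * (diff @ diff))
        return QuadLowerBound(z, c, self.mu)   # c lower-bounds min f
```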

    Probabilistic Quantitative Precipitation Forecasting Using Ensemble Model Output Statistics

    Full text link
    Statistical post-processing of dynamical forecast ensembles is an essential component of weather forecasting. In this article, we present a post-processing method that generates full predictive probability distributions for precipitation accumulations based on ensemble model output statistics (EMOS). We model precipitation amounts by a generalized extreme value distribution that is left-censored at zero. This distribution permits modelling precipitation on the original scale without prior transformation of the data. A closed-form expression for its continuous ranked probability score can be derived and permits computationally efficient model fitting. We discuss an extension of our approach that incorporates further statistics characterizing the spatial variability of precipitation amounts in the vicinity of the location of interest. The proposed EMOS method is applied to daily 18-h forecasts of 6-h accumulated precipitation over Germany in 2011 using the COSMO-DE ensemble prediction system operated by the German Meteorological Service. It yields calibrated and sharp predictive distributions and compares favourably with extended logistic regression and Bayesian model averaging, which are state-of-the-art approaches for precipitation post-processing. The incorporation of neighbourhood information further improves predictive performance and turns out to be a useful strategy to account for displacement errors of the dynamical forecasts in a probabilistic forecasting framework.
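
    A small illustration of the left-censored GEV idea (not the paper's fitted model): all probability mass that the GEV places on negative accumulations is assigned to exactly zero precipitation, so the predictive CDF below mixes a point mass at zero with a continuous part. The parameter values are placeholders; in EMOS they would be linked to ensemble statistics and estimated by minimizing the closed-form CRPS. Note that scipy's genextreme uses the shape convention c = -xi.

```python
import numpy as np
from scipy.stats import genextreme

def censored_gev_cdf(y, mu, sigma, xi):
    """Predictive CDF of precipitation: GEV(mu, sigma, xi) left-censored at 0,
    i.e. P(Y <= y) = F_GEV(y) for y >= 0 and 0 for y < 0, with the GEV mass
    on negative values appearing as a point mass at zero."""
    y = np.asarray(y, dtype=float)
    cdf = genextreme.cdf(y, c=-xi, loc=mu, scale=sigma)   # scipy: c = -xi
    return np.where(y < 0.0, 0.0, cdf)

# Probability of a dry 6-h period under hypothetical parameter values:
p_dry = censored_gev_cdf(0.0, mu=1.2, sigma=2.0, xi=0.2)
```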

    Distributed Differentially Private Computation of Functions with Correlated Noise

    Full text link
    Many applications of machine learning, such as human health research, involve processing private or sensitive information. Privacy concerns may impose significant hurdles to collaboration in scenarios where there are multiple sites holding data and the goal is to estimate properties jointly across all datasets. Differentially private decentralized algorithms can provide strong privacy guarantees. However, the accuracy of the joint estimates may be poor when the datasets at each site are small. This paper proposes a new framework, Correlation Assisted Private Estimation (CAPE), for designing privacy-preserving decentralized algorithms with better accuracy guarantees in an honest-but-curious model. CAPE can be used in conjunction with the functional mechanism for statistical and machine learning optimization problems. A tighter characterization of the functional mechanism is provided that allows CAPE to achieve the same performance as a centralized algorithm in the decentralized setting using all datasets. Empirical results on regression and neural network problems for both synthetic and real datasets show that differentially private methods can be competitive with non-private algorithms in many scenarios of interest. Comment: The manuscript is partially subsumed by arXiv:1910.1291.
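
    A toy sketch of the correlated-noise idea behind CAPE (not the paper's protocol): each site releases its local statistic plus a share of zero-sum correlated noise and a smaller independent local perturbation, so individual releases remain noisy while the correlated parts cancel when the aggregator averages them. In CAPE the zero-sum noise is produced by a distributed protocol and the two noise scales are calibrated to the privacy budget; both are simplified to placeholders here.

```python
import numpy as np

def cape_style_releases(local_values, sigma_corr, sigma_local, rng):
    """Each of S sites releases value + e_s + f_s, where the e_s sum to zero
    across sites (correlated noise) and the f_s are small independent local
    perturbations; the average of the releases only carries the f-noise."""
    S = len(local_values)
    e = rng.normal(0.0, sigma_corr, size=S)
    e -= e.mean()                         # enforce sum(e) = 0 for illustration
    f = rng.normal(0.0, sigma_local, size=S)
    releases = np.asarray(local_values) + e + f
    return releases, releases.mean()      # correlated noise cancels in the mean

rng = np.random.default_rng(0)
rel, agg = cape_style_releases([0.9, 1.1, 1.0, 0.8], 1.0, 0.05, rng)
```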

    The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo

    Full text link
    Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm that avoids the random walk behavior and sensitivity to correlated parameters that plague many MCMC methods by taking a series of steps informed by first-order gradient information. These features allow it to converge to high-dimensional target distributions much more quickly than simpler methods such as random walk Metropolis or Gibbs sampling. However, HMC's performance is highly sensitive to two user-specified parameters: a step size ε and a desired number of steps L. In particular, if L is too small then the algorithm exhibits undesirable random walk behavior, while if L is too large the algorithm wastes computation. We introduce the No-U-Turn Sampler (NUTS), an extension to HMC that eliminates the need to set a number of steps L. NUTS uses a recursive algorithm to build a set of likely candidate points that spans a wide swath of the target distribution, stopping automatically when it starts to double back and retrace its steps. Empirically, NUTS performs at least as efficiently as, and sometimes more efficiently than, a well-tuned standard HMC method, without requiring user intervention or costly tuning runs. We also derive a method for adapting the step size parameter ε on the fly based on primal-dual averaging. NUTS can thus be used with no hand-tuning at all. NUTS is also suitable for applications such as BUGS-style automatic inference engines that require efficient "turnkey" sampling algorithms. Comment: 30 pages, 7 figures.
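
    Two of the ingredients described above are easy to sketch: the leapfrog integrator that generates candidate points, and the "no-U-turn" condition that stops the trajectory once it starts doubling back. The recursive tree building, slice variable and acceptance logic of full NUTS are omitted; this is only an illustrative fragment with assumed function names.

```python
import numpy as np

def leapfrog(theta, r, grad_log_p, eps):
    """One leapfrog step of Hamiltonian dynamics (identity mass matrix)."""
    r = r + 0.5 * eps * grad_log_p(theta)
    theta = theta + eps * r
    r = r + 0.5 * eps * grad_log_p(theta)
    return theta, r

def u_turn(theta_minus, theta_plus, r_minus, r_plus):
    """Stop extending the trajectory once either end starts moving back
    towards the other end (the no-U-turn criterion)."""
    d = theta_plus - theta_minus
    return (d @ r_minus) < 0.0 or (d @ r_plus) < 0.0
```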

    Parallel SGD: When does averaging help?

    Full text link
    Consider a number of workers running SGD independently on the same pool of data and averaging the models every once in a while -- a common but not well understood practice. We study model averaging as a variance-reducing mechanism and describe two ways in which the frequency of averaging affects convergence. For convex objectives, we show the benefit of frequent averaging depends on the gradient variance envelope. For non-convex objectives, we illustrate that this benefit depends on the presence of multiple globally optimal points. We complement our findings with multicore experiments on both synthetic and real data.
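
    The practice being analysed is simple to state in code. In the sketch below, each worker takes local_steps independent SGD steps between synchronisations, and all models are averaged at each synchronisation; grad(w, rng) is an assumed unbiased stochastic-gradient oracle, not part of the paper.

```python
import numpy as np

def parallel_sgd_with_averaging(grad, w0, n_workers, n_rounds, local_steps, lr, rng):
    """Workers run SGD independently and average their models every
    local_steps steps (smaller local_steps = more frequent averaging)."""
    workers = [w0.copy() for _ in range(n_workers)]
    for _ in range(n_rounds):
        for i in range(n_workers):
            for _ in range(local_steps):
                workers[i] = workers[i] - lr * grad(workers[i], rng)
        avg = np.mean(workers, axis=0)               # model averaging step
        workers = [avg.copy() for _ in range(n_workers)]
    return workers[0]
```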

    Uncertainty Quantification in Complex Simulation Models Using Ensemble Copula Coupling

    Full text link
    Critical decisions frequently rely on high-dimensional output from complex computer simulation models that show intricate cross-variable, spatial and temporal dependence structures, with weather and climate predictions being key examples. There is a strongly increasing recognition of the need for uncertainty quantification in such settings, for which we propose and review a general multi-stage procedure called ensemble copula coupling (ECC), proceeding as follows: 1. Generate a raw ensemble, consisting of multiple runs of the computer model that differ in the inputs or model parameters in suitable ways. 2. Apply statistical postprocessing techniques, such as Bayesian model averaging or nonhomogeneous regression, to correct for systematic errors in the raw ensemble, to obtain calibrated and sharp predictive distributions for each univariate output variable individually. 3. Draw a sample from each postprocessed predictive distribution. 4. Rearrange the sampled values in the rank order structure of the raw ensemble to obtain the ECC postprocessed ensemble. The use of ensembles and statistical postprocessing has become routine in weather forecasting over the past decade. We show that seemingly unrelated, recent advances can be interpreted, fused and consolidated within the framework of ECC, the common thread being the adoption of the empirical copula of the raw ensemble. Depending on the use of Quantiles, Random draws or Transformations at the sampling stage, we distinguish the ECC-Q, ECC-R and ECC-T variants, respectively. We also describe relations to the Schaake shuffle and extant copula-based techniques. In a case study, the ECC approach is applied to predictions of temperature, pressure, precipitation and wind over Germany, based on the 50-member European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble. Comment: Published at http://dx.doi.org/10.1214/13-STS443 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
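
    Step 4, the heart of ECC, is a one-line reordering once steps 1-3 have produced, for each variable, both a raw ensemble and a same-sized sample from the postprocessed distribution. A minimal sketch for a single variable, with illustrative names:

```python
import numpy as np

def ecc_rearrange(raw_ensemble, postprocessed_sample):
    """Reorder a postprocessed sample so it inherits the rank order of the
    raw ensemble; applied variable by variable, this transfers the raw
    ensemble's empirical copula to the postprocessed ensemble."""
    ranks = np.argsort(np.argsort(raw_ensemble))   # rank of each raw member
    return np.sort(postprocessed_sample)[ranks]
```

    In the ECC-Q variant the postprocessed sample would consist of equidistant quantiles of the calibrated predictive distribution, while ECC-R uses random draws and ECC-T a transformation of the raw members, as described in the abstract.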

    Probabilistic wind speed forecasting on a grid based on ensemble model output statistics

    Full text link
    Probabilistic forecasts of wind speed are important for a wide range of applications, ranging from operational decision making in connection with wind power generation to storm warnings, ship routing and aviation. We present a statistical method that provides locally calibrated, probabilistic wind speed forecasts at any desired place within the forecast domain based on the output of a numerical weather prediction (NWP) model. Three approaches for wind speed post-processing are proposed, which use either truncated normal, gamma or truncated logistic distributions to make probabilistic predictions about future observations conditional on the forecasts of an ensemble prediction system (EPS). In order to provide probabilistic forecasts on a grid, predictive distributions that were calibrated with local wind speed observations need to be interpolated. We study several interpolation schemes that combine geostatistical methods with local information on annual mean wind speeds, and evaluate the proposed methodology with surface wind speed forecasts over Germany from the COSMO-DE (Consortium for Small-scale Modelling) ensemble prediction system. Comment: Published at http://dx.doi.org/10.1214/15-AOAS843 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
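
    As one concrete instance of the three proposed approaches, a truncated-normal predictive distribution can be sketched as below. The location is taken linear in the ensemble mean and the squared scale linear in the ensemble variance, which is a common EMOS parameterisation rather than a claim about the paper's exact specification; the coefficients a, b, c, d are placeholders that would be estimated from training data.

```python
import numpy as np
from scipy.stats import truncnorm

def truncnormal_wind_forecast(ens_mean, ens_var, a, b, c, d):
    """Predictive distribution for wind speed: a normal distribution truncated
    at zero, with EMOS-style location and scale linked to ensemble statistics."""
    mu = a + b * ens_mean
    sigma = np.sqrt(c + d * ens_var)
    lower = (0.0 - mu) / sigma             # truncation point in standard units
    return truncnorm(lower, np.inf, loc=mu, scale=sigma)

# dist = truncnormal_wind_forecast(8.5, 2.3, a=0.2, b=1.0, c=0.5, d=1.0)
# dist.ppf([0.1, 0.5, 0.9])   # predictive quantiles at a grid point
```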

    Stochastic L-BFGS: Improved Convergence Rates and Practical Acceleration Strategies

    Full text link
    We revisit the stochastic limited-memory BFGS (L-BFGS) algorithm. By proposing a new framework for the convergence analysis, we prove improved convergence rates and computational complexities of the stochastic L-BFGS algorithms compared to previous works. In addition, we propose several practical acceleration strategies to speed up the empirical performance of such algorithms. We also provide theoretical analyses for most of the strategies. Experiments on large-scale logistic and ridge regression problems demonstrate that our proposed strategies yield significant improvements vis-à-vis competing state-of-the-art algorithms.
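
    For reference, the deterministic core that all L-BFGS variants share is the two-loop recursion that applies the limited-memory inverse-Hessian approximation to a gradient; in the stochastic setting the gradient and the curvature pairs would come from subsampled data. A generic sketch, not the paper's specific variant or its acceleration strategies:

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: apply the limited-memory inverse Hessian built from
    curvature pairs s_k = x_{k+1} - x_k, y_k = g_{k+1} - g_k to grad."""
    q = grad.copy()
    stack = []
    for s, y in zip(reversed(s_list), reversed(y_list)):   # newest to oldest
        rho = 1.0 / (y @ s)
        alpha = rho * (s @ q)
        q -= alpha * y
        stack.append((alpha, rho, s, y))
    if s_list:                                             # initial scaling H0
        q *= (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1])
    for alpha, rho, s, y in reversed(stack):               # oldest to newest
        beta = rho * (y @ q)
        q += (alpha - beta) * s
    return -q                                              # descent direction
```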

    Using deterministic approximations to accelerate SMC for posterior sampling

    Full text link
    Sequential Monte Carlo has become a standard tool for Bayesian inference for complex models. This approach can be computationally demanding, especially when initialized from the prior distribution. On the other hand, deterministic approximations of the posterior distribution are often available with no theoretical guarantees. We propose a bridge sampling scheme starting from such a deterministic approximation of the posterior distribution and targeting the true one. The resulting Shortened Bridge Sampler (SBS) relies on a sequence of distributions that is determined in an adaptive way. We illustrate the robustness and the efficiency of the methodology in a large simulation study. When applied to network datasets, SBS inference leads to different statistical conclusions from those supplied by the standard variational Bayes approximation.
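
    The bridge idea can be sketched as a short tempered SMC run: particles start as draws from the deterministic approximation q and are reweighted along intermediate targets proportional to q^(1 - gamma_t) * pi^(gamma_t) until gamma reaches 1. The sketch below uses a fixed schedule, plain multinomial resampling and no MCMC rejuvenation moves, whereas the paper chooses the (shortened) sequence adaptively; all function names are assumptions.

```python
import numpy as np

def bridge_smc(log_q, log_pi, sample_q, n_particles, gammas, rng):
    """Move a particle approximation from q towards pi along the tempered
    path p_t proportional to q**(1 - gamma_t) * pi**gamma_t."""
    particles = sample_q(n_particles, rng)        # initial draws from q
    log_w = np.zeros(n_particles)
    prev = 0.0
    for gamma in gammas:                          # e.g. 0 < g_1 < ... < 1
        log_w += (gamma - prev) * (log_pi(particles) - log_q(particles))
        prev = gamma
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        idx = rng.choice(n_particles, size=n_particles, p=w)   # resample
        particles, log_w = particles[idx], np.zeros(n_particles)
    return particles                              # approximate draws from pi
```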