Forecasting under model uncertainty: Non-homogeneous hidden Markov models with Polya-Gamma data augmentation
We consider two-state Non-Homogeneous Hidden Markov Models (NHHMMs) for
forecasting univariate time series. Given a set of predictors, the time series
are modeled via predictive regressions with state dependent coefficients and
time-varying transition probabilities that depend on the predictors via a
logistic function. In a hidden Markov setting, inference for logistic
regression coefficients becomes complicated and in some cases impossible due to
convergence issues. In this paper, we aim to address this problem using a new
latent variable scheme that utilizes the Pólya-Gamma class of
distributions. We allow for model uncertainty regarding the predictors that
affect the series both linearly -- in the mean -- and non-linearly -- in the
transition matrix. Predictor selection and inference on the model parameters
are based on an MCMC scheme with reversible jump steps. Single-step and
multiple-step-ahead predictions are obtained from the most probable model, the
median probability model, or a Bayesian model averaging approach. Using
simulation experiments, we illustrate the performance of our algorithm in
various setups, in terms of mixing properties, model selection and predictive
ability. An empirical study on realized volatility data shows that our
methodology gives improved forecasts compared to benchmark models. Comment: 36 pages, 5 figures, 6 tables.
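The core of the augmentation scheme above is the Pólya-Gamma trick of Polson, Scott and Windle: conditional on latent variables ω_i ~ PG(1, x_iᵀβ), the logistic likelihood becomes Gaussian in β. Below is a minimal, self-contained sketch of that conditional update for a plain Bayesian logistic regression; the NHHMM-specific parts (state sequence, time-varying transition probabilities, reversible jump moves) are omitted, and sample_pg is a simple truncated-series approximation rather than an exact sampler.

```python
import numpy as np

def sample_pg(b, c, n_terms=200, rng=None):
    """Approximate draw from PG(b, c) via its infinite-sum-of-gammas
    representation, truncated to n_terms terms (illustrative only)."""
    rng = np.random.default_rng() if rng is None else rng
    k = np.arange(1, n_terms + 1)
    g = rng.gamma(shape=b, scale=1.0, size=n_terms)
    return np.sum(g / ((k - 0.5) ** 2 + (c / (2.0 * np.pi)) ** 2)) / (2.0 * np.pi ** 2)

def pg_gibbs_step(X, y, beta, B0_inv, rng):
    """One Gibbs sweep for logistic regression: omega | beta, then beta | omega.
    X: (n, p) predictors, y: (n,) binary responses, B0_inv: prior precision."""
    psi = X @ beta                                   # linear predictor
    omega = np.array([sample_pg(1.0, c, rng=rng) for c in psi])
    kappa = y - 0.5                                  # Polson-Scott-Windle shift
    V = np.linalg.inv(X.T @ (omega[:, None] * X) + B0_inv)
    m = V @ (X.T @ kappa)
    return rng.multivariate_normal(m, V)             # draw beta | omega, y
```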
An optimal first order method based on optimal quadratic averaging
In a recent paper, Bubeck, Lee, and Singh introduced a new first order method
for minimizing smooth strongly convex functions. Their geometric descent
algorithm, largely inspired by the ellipsoid method, enjoys the optimal linear
rate of convergence. We show that the same iterate sequence is generated by a
scheme that in each iteration computes an optimal average of quadratic
lower-models of the function. Indeed, the minimum of the averaged quadratic
approaches the true minimum at an optimal rate. This intuitive viewpoint
reveals clear connections to the original fast-gradient methods and cutting
plane ideas, and leads to limited-memory extensions with improved performance. Comment: 23 pages.
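A sketch of the key operation the abstract describes: optimally averaging two quadratic lower models with common curvature μ (the strong convexity constant). Under the assumed notation Q_i(x) = v_i + (μ/2)‖x − c_i‖², the helper below picks the mixing weight that maximizes the minimum value of the averaged quadratic; in the method itself the second model would typically be the strong convexity lower bound built at the current iterate. This illustrates only the averaging step, not the paper's full algorithm or its line search.

```python
import numpy as np

def average_quadratics(vA, cA, vB, cB, mu):
    """Optimally average two quadratic lower models
    Q_i(x) = v_i + (mu/2) * ||x - c_i||^2: return the model
    lam*Q_A + (1-lam)*Q_B whose minimum value is largest."""
    d2 = np.sum((cA - cB) ** 2)
    if d2 == 0.0:
        return max(vA, vB), cA
    # the minimum value is concave and quadratic in lam; clip its maximizer to [0, 1]
    lam = np.clip(0.5 + (vA - vB) / (mu * d2), 0.0, 1.0)
    c = lam * cA + (1.0 - lam) * cB
    v = lam * vA + (1.0 - lam) * vB + 0.5 * mu * lam * (1.0 - lam) * d2
    return v, c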
Probabilistic Quantitative Precipitation Forecasting Using Ensemble Model Output Statistics
Statistical post-processing of dynamical forecast ensembles is an essential
component of weather forecasting. In this article, we present a post-processing
method that generates full predictive probability distributions for
precipitation accumulations based on ensemble model output statistics (EMOS).
We model precipitation amounts by a generalized extreme value distribution that
is left-censored at zero. This distribution permits modelling precipitation on
the original scale without prior transformation of the data. A closed form
expression for its continuous rank probability score can be derived and permits
computationally efficient model fitting. We discuss an extension of our
approach that incorporates further statistics characterizing the spatial
variability of precipitation amounts in the vicinity of the location of
interest. The proposed EMOS method is applied to daily 18-h forecasts of 6-h
accumulated precipitation over Germany in 2011 using the COSMO-DE ensemble
prediction system operated by the German Meteorological Service. It yields
calibrated and sharp predictive distributions and compares favourably with
extended logistic regression and Bayesian model averaging, which are
state-of-the-art approaches for precipitation post-processing. The incorporation of
neighbourhood information further improves predictive performance and turns out
to be a useful strategy to account for displacement errors of the dynamical
forecasts in a probabilistic forecasting framework.
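As a rough illustration of the predictive distribution described above, the sketch below builds a generalized extreme value distribution left-censored at zero with scipy; the point mass at zero gives the probability of no precipitation. The affine link from the ensemble mean to the GEV location and all parameter values are placeholders, not fitted EMOS coefficients, and the closed-form CRPS used for estimation is not reproduced here.

```python
import numpy as np
from scipy.stats import genextreme

def censored_gev_forecast(ens, a=0.1, b=0.8, sigma=1.5, xi=0.2):
    """Illustrative censored-GEV predictive distribution for precipitation.
    a, b, sigma, xi are placeholder parameters."""
    mu = a + b * np.mean(ens)                       # location linked to ensemble mean
    dist = genextreme(c=-xi, loc=mu, scale=sigma)   # scipy's shape convention is c = -xi
    p_zero = dist.cdf(0.0)                          # point mass: probability of no rain

    def cdf(y):
        y = np.asarray(y, dtype=float)
        return np.where(y < 0.0, 0.0, dist.cdf(y))  # left-censored at zero

    def quantile(q):
        return np.maximum(dist.ppf(q), 0.0)         # censored quantile function

    return p_zero, cdf, quantile

# Example: probability of precipitation and the 90th percentile forecast
p_dry, F, Q = censored_gev_forecast(np.array([0.0, 0.4, 1.2, 2.5]))
print(1.0 - p_dry, Q(0.9))
```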
Distributed Differentially Private Computation of Functions with Correlated Noise
Many applications of machine learning, such as human health research, involve
processing private or sensitive information. Privacy concerns may impose
significant hurdles to collaboration in scenarios where there are multiple
sites holding data and the goal is to estimate properties jointly across all
datasets. Differentially private decentralized algorithms can provide strong
privacy guarantees. However, the accuracy of the joint estimates may be poor
when the datasets at each site are small. This paper proposes a new framework,
Correlation Assisted Private Estimation (CAPE), for designing
privacy-preserving decentralized algorithms with better accuracy guarantees in
an honest-but-curious model. CAPE can be used in conjunction with the
functional mechanism for statistical and machine learning optimization
problems. A tighter characterization of the functional mechanism is provided
that allows CAPE to achieve the same performance as a centralized algorithm in
the decentralized setting using all datasets. Empirical results on regression
and neural network problems for both synthetic and real datasets show that
differentially private methods can be competitive with non-private algorithms
in many scenarios of interest. Comment: The manuscript is partially subsumed by arXiv:1910.1291
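A minimal sketch of the correlated-noise idea the abstract describes: each site releases its local statistic perturbed by a zero-sum noise term that is correlated across sites plus a small independent term, so individual releases are noisy but the noise largely cancels when the releases are aggregated. The noise scales below are placeholders, not a calibrated (ε, δ) privacy budget, and this is not the paper's full protocol.

```python
import numpy as np

def cape_style_release(local_stats, sigma_corr=1.0, sigma_local=0.1, rng=None):
    """Illustrative correlated-noise release for S sites (honest-but-curious setting).
    Site s publishes local_stats[s] + e_s + g_s with sum(e_s) = 0."""
    rng = np.random.default_rng() if rng is None else rng
    S = len(local_stats)
    e = rng.normal(0.0, sigma_corr, size=S)
    e -= e.mean()                              # enforce zero-sum correlated noise
    g = rng.normal(0.0, sigma_local, size=S)   # small independent residual noise
    releases = np.asarray(local_stats) + e + g
    return releases, releases.mean()           # aggregate: true mean + mean(g) only

true_local = np.array([0.9, 1.1, 1.0, 1.2])
noisy, agg = cape_style_release(true_local)
print(agg)  # close to true_local.mean(): the correlated terms cancel on aggregation
```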
The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo
Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm
that avoids the random walk behavior and sensitivity to correlated parameters
that plague many MCMC methods by taking a series of steps informed by
first-order gradient information. These features allow it to converge to
high-dimensional target distributions much more quickly than simpler methods
such as random walk Metropolis or Gibbs sampling. However, HMC's performance is
highly sensitive to two user-specified parameters: a step size ε and a
desired number of steps L. In particular, if L is too small then the algorithm
exhibits undesirable random walk behavior, while if L is too large the
algorithm wastes computation. We introduce the No-U-Turn Sampler (NUTS), an
extension to HMC that eliminates the need to set a number of steps L. NUTS uses
a recursive algorithm to build a set of likely candidate points that spans a
wide swath of the target distribution, stopping automatically when it starts to
double back and retrace its steps. Empirically, NUTS performs at least as
efficiently as, and sometimes more efficiently than, a well-tuned standard HMC
method, without requiring user intervention or costly tuning runs. We also
derive a method for adapting the step size parameter ε on the fly
based on primal-dual averaging. NUTS can thus be used with no hand-tuning at
all. NUTS is also suitable for applications such as BUGS-style automatic
inference engines that require efficient "turnkey" sampling algorithms. Comment: 30 pages, 7 figures.
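For reference, here is a minimal sketch of the two ingredients NUTS builds on: the leapfrog integrator used by HMC and the U-turn criterion that stops trajectory growth. The recursive doubling and the slice/multinomial selection of a candidate point, which make the full sampler leave the target invariant, are omitted.

```python
import numpy as np

def leapfrog(theta, r, grad_logp, eps):
    """One leapfrog step for HMC with an identity mass matrix."""
    r = r + 0.5 * eps * grad_logp(theta)   # half step for momentum
    theta = theta + eps * r                # full step for position
    r = r + 0.5 * eps * grad_logp(theta)   # half step for momentum
    return theta, r

def is_u_turn(theta_minus, theta_plus, r_minus, r_plus):
    """NUTS stopping rule: stop when the span between the trajectory's
    endpoints stops growing in either direction."""
    span = theta_plus - theta_minus
    return (span @ r_minus < 0.0) or (span @ r_plus < 0.0)
```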
Parallel SGD: When does averaging help?
Consider a number of workers running SGD independently on the same pool of
data and averaging the models every once in a while -- a common but not well
understood practice. We study model averaging as a variance-reducing mechanism
and describe two ways in which the frequency of averaging affects convergence.
For convex objectives, we show the benefit of frequent averaging depends on the
gradient variance envelope. For non-convex objectives, we illustrate that this
benefit depends on the presence of multiple globally optimal points. We
complement our findings with multicore experiments on both synthetic and real
data.
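A small simulation sketch of the practice being studied: several workers run local SGD on their own data pools and average their models every few steps. The sequential simulation and all names are illustrative only.

```python
import numpy as np

def parallel_sgd(grad_fn, data_pools, w0, lr=0.1, n_rounds=50, local_steps=10, rng=None):
    """Simulate workers running local SGD with periodic model averaging.
    grad_fn(w, sample) returns a stochastic gradient; data_pools[i] is worker i's data."""
    rng = np.random.default_rng() if rng is None else rng
    workers = [np.array(w0, dtype=float) for _ in data_pools]
    for _ in range(n_rounds):
        for w, pool in zip(workers, data_pools):
            for _ in range(local_steps):
                sample = pool[rng.integers(len(pool))]
                w -= lr * grad_fn(w, sample)       # in-place local SGD step
        avg = np.mean(workers, axis=0)             # periodic model averaging
        workers = [avg.copy() for _ in workers]
    return workers[0]
```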
Uncertainty Quantification in Complex Simulation Models Using Ensemble Copula Coupling
Critical decisions frequently rely on high-dimensional output from complex
computer simulation models that show intricate cross-variable, spatial and
temporal dependence structures, with weather and climate predictions being key
examples. There is increasing recognition of the need for
uncertainty quantification in such settings, for which we propose and review a
general multi-stage procedure called ensemble copula coupling (ECC), proceeding
as follows: 1. Generate a raw ensemble, consisting of multiple runs of the
computer model that differ in the inputs or model parameters in suitable ways.
2. Apply statistical postprocessing techniques, such as Bayesian model
averaging or nonhomogeneous regression, to correct for systematic errors in the
raw ensemble, to obtain calibrated and sharp predictive distributions for each
univariate output variable individually. 3. Draw a sample from each
postprocessed predictive distribution. 4. Rearrange the sampled values in the
rank order structure of the raw ensemble to obtain the ECC postprocessed
ensemble. The use of ensembles and statistical postprocessing has become
routine in weather forecasting over the past decade. We show that seemingly
unrelated, recent advances can be interpreted, fused and consolidated within
the framework of ECC, the common thread being the adoption of the empirical
copula of the raw ensemble. Depending on the use of Quantiles, Random draws or
Transformations at the sampling stage, we distinguish the ECC-Q, ECC-R and
ECC-T variants, respectively. We also describe relations to the Schaake shuffle
and extant copula-based techniques. In a case study, the ECC approach is
applied to predictions of temperature, pressure, precipitation and wind over
Germany, based on the 50-member European Centre for Medium-Range Weather
Forecasts (ECMWF) ensemble. Comment: Published at http://dx.doi.org/10.1214/13-STS443 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
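A compact sketch of the ECC-Q variant described above: draw equidistant quantiles from each postprocessed marginal and rearrange them in the rank order of the raw ensemble, so the multivariate dependence structure of the raw ensemble is inherited through its empirical copula. Array shapes and names are illustrative.

```python
import numpy as np

def ecc_q(raw_ensemble, marginal_ppfs):
    """ECC-Q: reorder postprocessed quantiles in the rank order of the raw ensemble.
    raw_ensemble: (m, d) array of m raw members for d output variables.
    marginal_ppfs: list of d quantile functions from univariate postprocessing."""
    m, d = raw_ensemble.shape
    levels = (np.arange(1, m + 1) - 0.5) / m            # equidistant quantile levels
    ecc = np.empty_like(raw_ensemble, dtype=float)
    for j in range(d):
        sample = np.sort(marginal_ppfs[j](levels))       # sorted postprocessed draws
        ranks = raw_ensemble[:, j].argsort().argsort()   # rank of each raw member
        ecc[:, j] = sample[ranks]                        # copy the raw rank structure
    return ecc
```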
Probabilistic wind speed forecasting on a grid based on ensemble model output statistics
Probabilistic forecasts of wind speed are important for a wide range of
applications, ranging from operational decision making in connection with wind
power generation to storm warnings, ship routing and aviation. We present a
statistical method that provides locally calibrated, probabilistic wind speed
forecasts at any desired place within the forecast domain based on the output
of a numerical weather prediction (NWP) model. Three approaches for wind speed
post-processing are proposed, which use either truncated normal, gamma or
truncated logistic distributions to make probabilistic predictions about future
observations conditional on the forecasts of an ensemble prediction system
(EPS). In order to provide probabilistic forecasts on a grid, predictive
distributions that were calibrated with local wind speed observations need to
be interpolated. We study several interpolation schemes that combine
geostatistical methods with local information on annual mean wind speeds, and
evaluate the proposed methodology with surface wind speed forecasts over
Germany from the COSMO-DE (Consortium for Small-scale Modelling) ensemble
prediction system. Comment: Published at http://dx.doi.org/10.1214/15-AOAS843 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
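A minimal sketch of one of the three proposed approaches, a truncated normal EMOS model for non-negative wind speed, using scipy. The affine links from ensemble mean and variance to the predictive location and scale follow the usual EMOS form, but the coefficients shown are placeholders, and the local calibration and geostatistical interpolation steps are omitted.

```python
import numpy as np
from scipy.stats import truncnorm

def tn_emos_forecast(ens, a=0.2, b=0.9, c=0.5, d=1.2):
    """Truncated-normal EMOS predictive distribution for wind speed (>= 0).
    a, b, c, d are placeholder coefficients, not fitted values."""
    mu = a + b * np.mean(ens)                 # location affine in the ensemble mean
    sigma = np.sqrt(c + d * np.var(ens))      # scale linked to the ensemble spread
    alpha = (0.0 - mu) / sigma                # scipy's standardized lower bound
    return truncnorm(a=alpha, b=np.inf, loc=mu, scale=sigma)

dist = tn_emos_forecast(np.array([4.1, 5.0, 3.8, 6.2]))
print(dist.mean(), dist.ppf([0.1, 0.5, 0.9]))  # point forecast and predictive quantiles
```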
Stochastic L-BFGS: Improved Convergence Rates and Practical Acceleration Strategies
We revisit the stochastic limited-memory BFGS (L-BFGS) algorithm. By
proposing a new framework for the convergence analysis, we prove improved
convergence rates and computational complexities of the stochastic L-BFGS
algorithms compared to previous works. In addition, we propose several
practical acceleration strategies to speed up the empirical performance of such
algorithms. We also provide theoretical analyses for most of the strategies.
Experiments on large-scale logistic and ridge regression problems demonstrate
that our proposed strategies yield significant improvements vis-à-vis
competing state-of-the-art algorithms.
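For context, the deterministic backbone shared by stochastic L-BFGS variants is the two-loop recursion, which turns stored curvature pairs (s, y) into an approximate inverse-Hessian-times-gradient product. The sketch below applies that recursion to a (possibly stochastic) gradient; the variance reduction and acceleration strategies analyzed in the paper are not included.

```python
import numpy as np

def two_loop_direction(grad, s_hist, y_hist):
    """L-BFGS two-loop recursion: approximate H @ grad from curvature pairs (s, y).
    Use as w -= lr * two_loop_direction(g, s_hist, y_hist)."""
    q = grad.copy()
    alphas = []
    for s, y in reversed(list(zip(s_hist, y_hist))):    # newest pair first
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        alphas.append((rho, a))
    if s_hist:                                           # initial Hessian scaling
        s, y = s_hist[-1], y_hist[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), (rho, a) in zip(zip(s_hist, y_hist), reversed(alphas)):
        b = rho * (y @ q)
        q += (a - b) * s
    return q
```

In the stochastic setting, the curvature pairs are typically formed from gradients (or Hessian-vector products) evaluated on the same subsample, since differencing gradients computed on different minibatches makes y excessively noisy.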
Using deterministic approximations to accelerate SMC for posterior sampling
Sequential Monte Carlo has become a standard tool for Bayesian inference of
complex models. This approach can be computationally demanding, especially when
initialized from the prior distribution. On the other hand, deterministic
approximations of the posterior distribution are often available with no
theoretical guarantees. We propose a bridge sampling scheme starting from such
a deterministic approximation of the posterior distribution and targeting the
true one. The resulting Shortened Bridge Sampler (SBS) relies on a sequence of
distributions that is determined in an adaptive way. We illustrate the
robustness and the efficiency of the methodology on a large simulation study.
When applied to network datasets, SBS inference leads to statistical
conclusions that differ from those supplied by the standard variational Bayes
approximation.
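A sketch of the adaptive bridging idea described above: move along the geometric path π_λ ∝ q^(1−λ) p^λ from the deterministic approximation q to the posterior p, choosing each tempering step by bisection so that the relative effective sample size of the incremental weights stays above a threshold. The resampling and MCMC move steps of a full SMC sampler are omitted, and all names are illustrative.

```python
import numpy as np

def next_lambda(log_q, log_p, lam, ess_frac=0.5):
    """Choose the next tempering exponent on the path pi_lam ∝ q^(1-lam) * p^lam
    by bisection on the relative ESS of the incremental weights
    log_w = (lam_new - lam) * (log_p - log_q), assuming ESS decreases with the step."""
    n = len(log_q)
    delta_max = 1.0 - lam

    def rel_ess(delta):
        log_w = delta * (log_p - log_q)
        log_w -= log_w.max()                 # stabilize before exponentiating
        w = np.exp(log_w)
        return (w.sum() ** 2) / (n * (w ** 2).sum())

    if rel_ess(delta_max) >= ess_frac:       # can jump straight to the posterior
        return 1.0
    lo, hi = 0.0, delta_max
    for _ in range(50):                      # bisection on the step size
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if rel_ess(mid) >= ess_frac else (lo, mid)
    return lam + lo
```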