
    Large-scale linear regression: Development of high-performance routines

    In statistics, series of ordinary least squares (OLS) problems are used to study the linear correlation among sets of variables of interest; in many studies, the number of such variables is at least in the millions, and the corresponding datasets occupy terabytes of disk space. As the availability of large-scale datasets increases regularly, so does the challenge in dealing with them. Indeed, traditional solvers, which rely on "black-box" routines optimized for one single OLS problem, are highly inefficient and fail to provide a viable solution for big-data analyses. As a case study, in this paper we consider a linear regression consisting of two-dimensional grids of related OLS problems that arise in the context of genome-wide association analyses, and give a careful walkthrough for the development of ols-grid, a high-performance routine for shared-memory architectures; analogous steps are relevant for tailoring OLS solvers to other applications. In particular, we first illustrate the design of efficient algorithms that exploit the structure of the OLS problems and eliminate redundant computations; then, we show how to effectively deal with datasets that do not fit in main memory; finally, we discuss how to cast the computation in terms of efficient kernels and how to achieve scalability. Importantly, each design decision along the way is justified by simple performance models. ols-grid enables the solution of $10^{11}$ correlated OLS problems operating on terabytes of data in a matter of hours.
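The idea of eliminating redundant computations across related OLS problems can be illustrated with a toy sketch. This is not the ols-grid algorithm: it assumes the simplest possible case, in which many responses are each regressed on one shared regressor, and the function name `fit_many_simple_ols` is hypothetical. The quantities that depend only on the regressor are computed once and reused for every response.

```python
def fit_many_simple_ols(x, ys):
    """Fit y = a + b*x for every response in ys, sharing the work that depends only on x.

    Toy assumption: one regressor shared by all problems (not the grid structure
    treated in the paper).
    """
    n = len(x)
    x_bar = sum(x) / n
    x_centered = [xi - x_bar for xi in x]        # computed once
    sxx = sum(c * c for c in x_centered)         # computed once
    fits = []
    for y in ys:
        y_bar = sum(y) / n
        # Centered x sums to zero, so y does not need to be centered here.
        slope = sum(c * yi for c, yi in zip(x_centered, y)) / sxx
        fits.append((y_bar - slope * x_bar, slope))  # (intercept, slope)
    return fits


x = [0.0, 1.0, 2.0, 3.0]
ys = [[1.0, 3.0, 5.0, 7.0], [2.0, 2.0, 2.0, 2.0]]
fits = fit_many_simple_ols(x, ys)
```

Each additional response then costs a single pass over the data, instead of a full solve from scratch.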

    Dynamic Bayesian Predictive Synthesis in Time Series Forecasting

    We discuss model and forecast combination in time series forecasting. A foundational Bayesian perspective based on agent opinion analysis theory defines a new framework for density forecast combination, and encompasses several existing forecast pooling methods. We develop a novel class of dynamic latent factor models for time series forecast synthesis; simulation-based computation enables implementation. These models can dynamically adapt to time-varying biases, miscalibration and inter-dependencies among multiple models or forecasters. A macroeconomic forecasting study highlights the dynamic relationships among synthesized forecast densities, as well as the potential for improved forecast accuracy at multiple horizons.

    Auxiliary Likelihood-Based Approximate Bayesian Computation in State Space Models

    A computationally simple approach to inference in state space models is proposed, using approximate Bayesian computation (ABC). ABC avoids evaluation of an intractable likelihood by matching summary statistics for the observed data with statistics computed from data simulated from the true process, based on parameter draws from the prior. Draws that produce a 'match' between observed and simulated summaries are retained, and used to estimate the inaccessible posterior. With no reduction to a low-dimensional set of sufficient statistics being possible in the state space setting, we define the summaries as the maximum of an auxiliary likelihood function, and thereby exploit the asymptotic sufficiency of this estimator for the auxiliary parameter vector. We derive conditions under which this approach, including a computationally efficient version based on the auxiliary score, achieves Bayesian consistency. To reduce the well-documented inaccuracy of ABC in multi-parameter settings, we propose the separate treatment of each parameter dimension using an integrated likelihood technique. Three stochastic volatility models for which exact Bayesian inference is either computationally challenging, or infeasible, are used for illustration. We demonstrate that our approach compares favorably against an extensive set of approximate and exact comparators. An empirical illustration completes the paper. Comment: This paper is forthcoming at the Journal of Computational and Graphical Statistics. It also supersedes the earlier arXiv paper "Approximate Bayesian Computation in State Space Models" (arXiv:1409.8363).
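The basic accept/reject mechanism described in this abstract can be sketched as follows. This is a minimal illustration of plain rejection ABC, not the paper's auxiliary-likelihood method for state space models. It assumes a toy Gaussian model with unknown mean and known unit variance, a uniform prior, and the sample mean as summary statistic (which is the maximum likelihood estimate of the mean in this toy model). The function name `abc_rejection`, the tolerance and all numerical settings are hypothetical.

```python
import random
import statistics


def abc_rejection(observed, prior_draw, simulate, summary, tol, n_draws, rng):
    """Retain prior draws whose simulated summary lies within tol of the observed summary."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)                               # draw from the prior
        s_sim = summary(simulate(theta, len(observed), rng))  # summarize simulated data
        if abs(s_sim - s_obs) <= tol:                         # a 'match'
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(200)]  # synthetic data, true mean 2.0 (assumed)

posterior_draws = abc_rejection(
    observed,
    prior_draw=lambda r: r.uniform(-5.0, 5.0),
    simulate=lambda mu, n, r: [r.gauss(mu, 1.0) for _ in range(n)],
    summary=statistics.fmean,
    tol=0.1,
    n_draws=5000,
    rng=rng,
)
abc_posterior_mean = statistics.fmean(posterior_draws)
```

The retained draws approximate the posterior of the mean; tightening `tol` improves the approximation at the cost of fewer accepted draws.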

    Parallel Sequential Monte Carlo for Efficient Density Combination: The DeCo MATLAB Toolbox

    This paper presents the MATLAB package DeCo (Density Combination), which is based on the paper by Billio et al. (2013), where a constructive Bayesian approach is presented for combining predictive densities originating from different models or other sources of information. The combination weights are time-varying and may depend on past predictive forecasting performances and other learning mechanisms. The core algorithm is the function DeCo, which applies banks of parallel Sequential Monte Carlo algorithms to filter the time-varying combination weights. The DeCo procedure has been implemented both for standard CPU computing and for Graphics Processing Unit (GPU) parallel computing. For the GPU implementation we use the MATLAB Parallel Computing Toolbox and show how to use general-purpose GPU computing almost effortlessly. This GPU implementation reduces execution time by a factor of up to seventy compared to a standard CPU MATLAB implementation on a multicore CPU. We show the use of the package and the computational gain of the GPU version through some simulation experiments and an empirical application.

    Getting Started with Particle Metropolis-Hastings for Inference in Nonlinear Dynamical Models

    This tutorial provides a gentle introduction to the particle Metropolis-Hastings (PMH) algorithm for parameter inference in nonlinear state-space models together with a software implementation in the statistical programming language R. We employ a step-by-step approach to develop an implementation of the PMH algorithm (and the particle filter within) together with the reader. This final implementation is also available as the package pmhtutorial in the CRAN repository. Throughout the tutorial, we provide some intuition as to how the algorithm operates and discuss some solutions to problems that might occur in practice. To illustrate the use of PMH, we consider parameter inference in a linear Gaussian state-space model with synthetic data and a nonlinear stochastic volatility model with real-world data. Comment: 41 pages, 7 figures. In press for Journal of Statistical Software. Source code for R, Python and MATLAB available at: https://github.com/compops/pmh-tutoria
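A minimal sketch of PMH with a bootstrap particle filter inside is given below. It is not the code of the pmhtutorial package. It assumes a linear Gaussian state-space model of the form x_t = phi * x_{t-1} + sigma_v * v_t, y_t = x_t + sigma_e * e_t, with only phi unknown, a uniform prior on (-1, 1), and a Gaussian random-walk proposal. All function names and numerical settings are hypothetical.

```python
import math
import random


def simulate_lgss(phi, sigma_v, sigma_e, n_obs, rng):
    """Simulate observations from the assumed linear Gaussian state-space model."""
    x, ys = 0.0, []
    for _ in range(n_obs):
        x = phi * x + sigma_v * rng.gauss(0.0, 1.0)
        ys.append(x + sigma_e * rng.gauss(0.0, 1.0))
    return ys


def pf_loglik(ys, phi, sigma_v, sigma_e, n_particles, rng):
    """Bootstrap particle filter estimate of the log-likelihood of ys given phi."""
    particles = [0.0] * n_particles
    log_norm = -0.5 * math.log(2.0 * math.pi * sigma_e ** 2)
    loglik = 0.0
    for y in ys:
        # Propagate particles through the state equation.
        particles = [phi * x + sigma_v * rng.gauss(0.0, 1.0) for x in particles]
        # Weight by the observation density, shifting by the max for stability.
        logw = [log_norm - 0.5 * ((y - x) / sigma_e) ** 2 for x in particles]
        top = max(logw)
        w = [math.exp(lw - top) for lw in logw]
        loglik += top + math.log(sum(w) / n_particles)
        # Multinomial resampling.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik


def pmh(ys, n_iters, step, sigma_v, sigma_e, n_particles, rng, phi0=0.5):
    """Random-walk PMH for phi; the particle filter replaces the exact likelihood."""
    phi = phi0
    ll = pf_loglik(ys, phi, sigma_v, sigma_e, n_particles, rng)
    chain, n_accepted = [phi], 0
    for _ in range(n_iters):
        proposal = phi + step * rng.gauss(0.0, 1.0)
        if abs(proposal) < 1.0:  # zero prior density outside (-1, 1)
            ll_prop = pf_loglik(ys, proposal, sigma_v, sigma_e, n_particles, rng)
            if rng.random() < math.exp(min(0.0, ll_prop - ll)):
                phi, ll = proposal, ll_prop
                n_accepted += 1
        chain.append(phi)
    return chain, n_accepted / n_iters


rng = random.Random(1)
ys = simulate_lgss(phi=0.75, sigma_v=1.0, sigma_e=0.5, n_obs=50, rng=rng)
chain, accept_rate = pmh(ys, n_iters=200, step=0.1, sigma_v=1.0,
                         sigma_e=0.5, n_particles=50, rng=rng)
```

The chain here is far too short for reliable inference; the settings are chosen only so that the sketch runs quickly. The log-likelihood estimate of the current state is stored and reused rather than recomputed, which is what keeps the chain targeting the correct posterior.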