4,207 research outputs found

Large-scale linear regression: Development of high-performance routines

Author: Bientinesi Paolo
Fabregat-Traver Diego
Frank Alvaro
Publication date: 01/01/2015

In statistics, series of ordinary least squares (OLS) problems are used to study the linear correlation among sets of variables of interest; in many studies, the number of such variables is at least in the millions, and the corresponding datasets occupy terabytes of disk space. As the availability of large-scale datasets increases regularly, so does the challenge in dealing with them. Indeed, traditional solvers, which rely on "black-box" routines optimized for a single OLS problem, are highly inefficient and fail to provide a viable solution for big-data analyses. As a case study, in this paper we consider a linear regression consisting of two-dimensional grids of related OLS problems that arise in the context of genome-wide association analyses, and give a careful walkthrough for the development of ols-grid, a high-performance routine for shared-memory architectures; analogous steps are relevant for tailoring OLS solvers to other applications. In particular, we first illustrate the design of efficient algorithms that exploit the structure of the OLS problems and eliminate redundant computations; then, we show how to effectively deal with datasets that do not fit in main memory; finally, we discuss how to cast the computation in terms of efficient kernels and how to achieve scalability. Importantly, each design decision along the way is justified by simple performance models. ols-grid enables the solution of $10^{11}$ correlated OLS problems operating on terabytes of data in a matter of hours.
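As a rough illustration of what "eliminating redundant computations" across a grid of related OLS problems can mean, the sketch below fits the simple regression y_j ~ 1 + x_i for every pair (i, j). It is a toy under stated assumptions, not the ols-grid algorithm: the function name `ols_grid_slopes`, the one-predictor model, and the sample data are all hypothetical. The point is only that centering and the sum of squares are computed once per predictor and then reused for every response.

```python
from statistics import fmean


def center(values):
    """Subtract the mean so that the intercept drops out of the slope formula."""
    mean = fmean(values)
    return [v - mean for v in values]


def ols_grid_slopes(xs, ys):
    """Return slopes[i][j] for the regression y_j ~ 1 + x_i (hypothetical helper).

    Work that depends only on x_i (centering, sum of squares) is done once
    per predictor instead of once per (i, j) problem.
    """
    centered = [center(x) for x in xs]
    sum_sq = [sum(a * a for a in xc) for xc in centered]
    # Each of the len(xs) * len(ys) problems now costs a single dot product.
    return [
        [sum(a * b for a, b in zip(xc, y)) / sxx for y in ys]
        for xc, sxx in zip(centered, sum_sq)
    ]


# Made-up data: two predictors and two responses, i.e. a 2 x 2 grid of problems.
xs = [[0.0, 1.0, 2.0, 3.0], [1.0, 0.0, 1.0, 0.0]]
ys = [[1.0, 3.0, 5.0, 7.0], [2.0, 2.5, 3.0, 3.5]]
slopes = ols_grid_slopes(xs, ys)
```

With these inputs, the first response is exactly 1 + 2x for the first predictor, so `slopes[0][0]` comes out as 2. A realistic solver would work with matrix factorizations and blocked kernels rather than Python lists.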

arXiv.org e-Print Archive

Publikationsserver der RWTH Aachen University

Dynamic Bayesian Predictive Synthesis in Time Series Forecasting

Author: McAlinn Kenichiro
West Mike
Publication date: 05/11/2017

We discuss model and forecast combination in time series forecasting. A foundational Bayesian perspective based on agent opinion analysis theory defines a new framework for density forecast combination, and encompasses several existing forecast pooling methods. We develop a novel class of dynamic latent factor models for time series forecast synthesis; simulation-based computation enables implementation. These models can dynamically adapt to time-varying biases, miscalibration and inter-dependencies among multiple models or forecasters. A macroeconomic forecasting study highlights the dynamic relationships among synthesized forecast densities, as well as the potential for improved forecast accuracy at multiple horizons.

arXiv.org e-Print Archive

Auxiliary Likelihood-Based Approximate Bayesian Computation in State Space Models

Author: Frazier David T.
Maneesoonthorn Worapree
Martin Gael M.
McCabe Brendan P. M.
Robert Christian P.
Publication date: 02/12/2018

A computationally simple approach to inference in state space models is proposed, using approximate Bayesian computation (ABC). ABC avoids evaluation of an intractable likelihood by matching summary statistics for the observed data with statistics computed from data simulated from the true process, based on parameter draws from the prior. Draws that produce a 'match' between observed and simulated summaries are retained, and used to estimate the inaccessible posterior. With no reduction to a low-dimensional set of sufficient statistics being possible in the state space setting, we define the summaries as the maximum of an auxiliary likelihood function, and thereby exploit the asymptotic sufficiency of this estimator for the auxiliary parameter vector. We derive conditions under which this approach - including a computationally efficient version based on the auxiliary score - achieves Bayesian consistency. To reduce the well-documented inaccuracy of ABC in multi-parameter settings, we propose the separate treatment of each parameter dimension using an integrated likelihood technique. Three stochastic volatility models for which exact Bayesian inference is either computationally challenging, or infeasible, are used for illustration. We demonstrate that our approach compares favorably against an extensive set of approximate and exact comparators. An empirical illustration completes the paper.

Comment: This paper is forthcoming at the Journal of Computational and Graphical Statistics. It also supersedes the earlier arXiv paper "Approximate Bayesian Computation in State Space Models" (arXiv:1409.8363).
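The accept/reject mechanism described in this abstract can be sketched on a toy problem. The example below is a plain rejection ABC sampler, not the auxiliary-likelihood method of the paper: it assumes a Gaussian model with unknown mean and unit variance, a Uniform(-5, 5) prior, and the sample mean as the summary statistic. The function name `abc_rejection`, the tolerance `eps`, and all numerical settings are hypothetical choices made for illustration.

```python
import random
from statistics import fmean


def abc_rejection(observed, n_draws, eps, rng):
    """Keep prior draws whose simulated summary lands within eps of the observed one."""
    s_obs = fmean(observed)  # summary statistic of the observed data
    n = len(observed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # parameter draw from the assumed prior
        simulated = [rng.gauss(mu, 1.0) for _ in range(n)]
        if abs(fmean(simulated) - s_obs) < eps:  # a 'match' between summaries
            accepted.append(mu)
    return accepted


rng = random.Random(0)
observed = [rng.gauss(1.5, 1.0) for _ in range(50)]  # synthetic "observed" data
posterior_draws = abc_rejection(observed, n_draws=5000, eps=0.1, rng=rng)
```

The retained values in `posterior_draws` approximate the posterior of the mean. In this toy model the sample mean happens to be a sufficient statistic, which is exactly the property that is unavailable in the state space setting and that motivates the auxiliary-likelihood summaries.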

arXiv.org e-Print Archive

HAL-uB

University of Liverpool Repository

Warwick Research Archives Portal Repository

FigShare

Parallel Sequential Monte Carlo for Efficient Density Combination: The DeCo MATLAB Toolbox

Author: Casarin Roberto
Grassi Stefano
Ravazzolo Francesco
Van Dijk Herman K.
Publication venue: 'Foundation for Open Access Statistic'
Publication date: 01/01/2014

This paper presents the Matlab package DeCo (Density Combination), which is based on the paper by Billio et al. (2013), where a constructive Bayesian approach is presented for combining predictive densities originating from different models or other sources of information. The combination weights are time-varying and may depend on past predictive forecasting performances and other learning mechanisms. The core algorithm is the function DeCo, which applies banks of parallel Sequential Monte Carlo algorithms to filter the time-varying combination weights. The DeCo procedure has been implemented both for standard CPU computing and for Graphics Processing Unit (GPU) parallel computing. For the GPU implementation we use the Matlab parallel computing toolbox and show how to use general-purpose GPU computing almost effortlessly. This GPU implementation comes with a speed-up in execution time of up to seventy times compared to a standard CPU Matlab implementation on a multicore CPU. We show the use of the package and the computational gain of the GPU version through simulation experiments and an empirical application.

Directory of Open Access Journals

Norges Banks vitenarkiv

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Kent Academic Repository

Journal of Statistical Software

Erasmus University Digital Repository

ART

Getting Started with Particle Metropolis-Hastings for Inference in Nonlinear Dynamical Models

Author: Dahlin Johan
Schön Thomas B.
Publication date: 01/01/2019

This tutorial provides a gentle introduction to the particle Metropolis-Hastings (PMH) algorithm for parameter inference in nonlinear state-space models together with a software implementation in the statistical programming language R. We employ a step-by-step approach to develop an implementation of the PMH algorithm (and the particle filter within) together with the reader. This final implementation is also available as the package pmhtutorial in the CRAN repository. Throughout the tutorial, we provide some intuition as to how the algorithm operates and discuss some solutions to problems that might occur in practice. To illustrate the use of PMH, we consider parameter inference in a linear Gaussian state-space model with synthetic data and a nonlinear stochastic volatility model with real-world data.

Comment: 41 pages, 7 figures. In press for Journal of Statistical Software. Source code for R, Python and MATLAB available at: https://github.com/compops/pmh-tutoria
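For readers unfamiliar with the technique, the following is a minimal sketch of PMH with a bootstrap particle filter inside. It is not the pmhtutorial code. It assumes a toy linear Gaussian model x_t = phi * x_{t-1} + sigma_v * v_t, y_t = x_t + sigma_e * e_t with known noise scales, a Uniform(-1, 1) prior on phi, and a Gaussian random-walk proposal. All function names and numerical settings are hypothetical.

```python
import math
import random


def simulate(phi, sigma_v, sigma_e, n_obs, rng):
    """Draw observations from the assumed linear Gaussian state-space model."""
    x, ys = 0.0, []
    for _ in range(n_obs):
        x = phi * x + sigma_v * rng.gauss(0.0, 1.0)
        ys.append(x + sigma_e * rng.gauss(0.0, 1.0))
    return ys


def pf_loglik(ys, phi, sigma_v, sigma_e, n_particles, rng):
    """Bootstrap particle filter estimate of the log-likelihood of ys."""
    particles = [0.0] * n_particles
    loglik = 0.0
    const = -0.5 * math.log(2.0 * math.pi * sigma_e ** 2)
    for y in ys:
        # Propagate every particle through the state equation.
        particles = [phi * p + sigma_v * rng.gauss(0.0, 1.0) for p in particles]
        # Weight by the observation density, shifted by the maximum for stability.
        logw = [const - 0.5 * ((y - p) / sigma_e) ** 2 for p in particles]
        top = max(logw)
        w = [math.exp(v - top) for v in logw]
        loglik += top + math.log(sum(w) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik


def pmh(ys, n_iters, step, n_particles, rng, sigma_v=1.0, sigma_e=0.5):
    """Random-walk PMH for phi under an assumed Uniform(-1, 1) prior."""
    phi = 0.5
    ll = pf_loglik(ys, phi, sigma_v, sigma_e, n_particles, rng)
    chain = []
    for _ in range(n_iters):
        prop = phi + step * rng.gauss(0.0, 1.0)
        if abs(prop) < 1.0:  # proposals outside the prior support are rejected
            ll_prop = pf_loglik(ys, prop, sigma_v, sigma_e, n_particles, rng)
            if rng.random() < math.exp(min(0.0, ll_prop - ll)):
                phi, ll = prop, ll_prop
        chain.append(phi)
    return chain


rng = random.Random(1)
ys = simulate(phi=0.75, sigma_v=1.0, sigma_e=0.5, n_obs=50, rng=rng)
chain = pmh(ys, n_iters=200, step=0.1, n_particles=50, rng=rng)
```

The key design point is that the acceptance ratio uses the noisy particle filter estimate of the likelihood, and the estimate attached to the current state is stored and reused rather than recomputed. A practical run would use far more iterations and particles and would discard an initial burn-in portion of `chain`.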

arXiv.org e-Print Archive

Publikationer från Linköpings universitet

Publikationer från Uppsala Universitet

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Journal of Statistical Software