12,930 research outputs found

    Sparse cointegration

    Full text link
    Cointegration analysis is used to estimate the long-run equilibrium relations between several time series. The coefficients of these long-run equilibrium relations are the cointegrating vectors. In this paper, we provide a sparse estimator of the cointegrating vectors. The estimation technique is sparse in the sense that some elements of the cointegrating vectors will be estimated as zero. For this purpose, we combine a penalized estimation procedure for vector autoregressive models with sparse reduced rank regression. The sparse cointegration procedure achieves a higher estimation accuracy than the traditional Johansen cointegration approach in settings where the true cointegrating vectors have a sparse structure, and/or when the sample size is low compared to the number of time series. We also discuss a criterion to determine the cointegration rank and we illustrate its good performance in several simulation settings. In a first empirical application we investigate whether the expectations hypothesis of the term structure of interest rates, implying sparse cointegrating vectors, holds in practice. In a second empirical application we show that forecast performance in high-dimensional systems can be improved by sparsely estimating the cointegration relations

    High-Dimensional Bayesian Geostatistics

    Full text link
    With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models unfeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity has ∼n\sim n floating point operations (flops), where nn the number of spatial locations (per iteration). We compare these methods and provide some insight into their methodological underpinnings

    Bayesian factorizations of big sparse tensors

    Full text link
    It has become routine to collect data that are structured as multiway arrays (tensors). There is an enormous literature on low rank and sparse matrix factorizations, but limited consideration of extensions to the tensor case in statistics. The most common low rank tensor factorization relies on parallel factor analysis (PARAFAC), which expresses a rank kk tensor as a sum of rank one tensors. When observations are only available for a tiny subset of the cells of a big tensor, the low rank assumption is not sufficient and PARAFAC has poor performance. We induce an additional layer of dimension reduction by allowing the effective rank to vary across dimensions of the table. For concreteness, we focus on a contingency table application. Taking a Bayesian approach, we place priors on terms in the factorization and develop an efficient Gibbs sampler for posterior computation. Theory is provided showing posterior concentration rates in high-dimensional settings, and the methods are shown to have excellent performance in simulations and several real data applications

    Covariance approximation for large multivariate spatial data sets with an application to multiple climate model errors

    Get PDF
    This paper investigates the cross-correlations across multiple climate model errors. We build a Bayesian hierarchical model that accounts for the spatial dependence of individual models as well as cross-covariances across different climate models. Our method allows for a nonseparable and nonstationary cross-covariance structure. We also present a covariance approximation approach to facilitate the computation in the modeling and analysis of very large multivariate spatial data sets. The covariance approximation consists of two parts: a reduced-rank part to capture the large-scale spatial dependence, and a sparse covariance matrix to correct the small-scale dependence error induced by the reduced rank approximation. We pay special attention to the case that the second part of the approximation has a block-diagonal structure. Simulation results of model fitting and prediction show substantial improvement of the proposed approximation over the predictive process approximation and the independent blocks analysis. We then apply our computational approach to the joint statistical modeling of multiple climate model errors.Comment: Published in at http://dx.doi.org/10.1214/11-AOAS478 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
    • …
    corecore