743 research outputs found

Covariance Estimation: The GLM and Regularization Perspectives

Author: Pourahmadi Mohsen
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2011

Finding an unconstrained and statistically interpretable reparameterization of a covariance matrix is still an open problem in statistics. Its solution is of central importance in covariance estimation, particularly in the recent high-dimensional data environment where enforcing the positive-definiteness constraint could be computationally expensive. We provide a survey of the progress made in modeling covariance matrices from two relatively complementary perspectives: (1) generalized linear models (GLM) or parsimony and use of covariates in low dimensions, and (2) regularization or sparsity for high-dimensional data. An emerging, unifying and powerful trend in both perspectives is that of reducing a covariance estimation problem to that of estimating a sequence of regression problems. We point out several instances of the regression-based formulation. A notable case is in sparse estimation of a precision matrix or a Gaussian graphical model leading to the fast graphical LASSO algorithm. Some advantages and limitations of the regression-based Cholesky decomposition relative to the classical spectral (eigenvalue) and variance-correlation decompositions are highlighted. The former provides an unconstrained and statistically interpretable reparameterization, and guarantees the positive-definiteness of the estimated covariance matrix. It reduces the unintuitive task of covariance estimation to that of modeling a sequence of regressions at the cost of imposing an a priori order among the variables. Elementwise regularization of the sample covariance matrix, such as banding, tapering and thresholding, has desirable asymptotic properties, and the sparse estimated covariance matrix is positive definite with probability tending to one for large samples and dimensions. Comment: Published at http://dx.doi.org/10.1214/11-STS358 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
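The regression-based Cholesky idea described in this abstract can be illustrated with a minimal sketch. It is not code from the paper: the function names `cholesky_by_regression` and `covariance_from_factors` are hypothetical, the variable order is simply the column order of the data, and each regression is plain unpenalized least squares. Under those assumptions, regressing each variable on its predecessors gives a unit lower-triangular matrix T and positive innovation variances d with T Σ Tᵀ = diag(d), so the reconstructed matrix is positive definite by construction.

```python
import numpy as np

def cholesky_by_regression(X):
    """Regress each column on its predecessors (hypothetical helper).

    Returns T (unit lower triangular, negated regression coefficients
    below the diagonal) and d (residual variances with divisor n).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)          # center the columns
    n, p = Xc.shape
    T = np.eye(p)
    d = np.empty(p)
    d[0] = Xc[:, 0] @ Xc[:, 0] / n   # first variable has no predecessors
    for j in range(1, p):
        # Least-squares regression of variable j on variables 0..j-1.
        phi, *_ = np.linalg.lstsq(Xc[:, :j], Xc[:, j], rcond=None)
        resid = Xc[:, j] - Xc[:, :j] @ phi
        T[j, :j] = -phi
        d[j] = resid @ resid / n
    return T, d

def covariance_from_factors(T, d):
    """Rebuild the covariance from T Sigma T' = diag(d)."""
    T_inv = np.linalg.inv(T)
    return T_inv @ np.diag(d) @ T_inv.T

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4)) @ rng.standard_normal((4, 4))
T, d = cholesky_by_regression(X)
sigma_hat = covariance_from_factors(T, d)
```

With unrestricted regressions the reconstruction simply reproduces the sample covariance (divisor n); the parsimony or sparsity the abstract refers to would come from modeling or penalizing the entries of T and d, which this sketch does not attempt.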

arXiv.org e-Print Archive

CiteSeerX

Crossref

Texas A&M Repository

Bayesian forecasting and scalable multivariate volatility analysis using simultaneous graphical dynamic models

Author: Gruber Lutz F.
West Mike
Publication venue: 'Elsevier BV'
Publication date: 27/06/2016

The recently introduced class of simultaneous graphical dynamic linear models (SGDLMs) defines an ability to scale on-line Bayesian analysis and forecasting to higher-dimensional time series. This paper advances the methodology of SGDLMs, developing and embedding a novel, adaptive method of simultaneous predictor selection in forward filtering for on-line learning and forecasting. The advances include developments in Bayesian computation for scalability, and a case study in exploring the resulting potential for improved short-term forecasting of large-scale volatility matrices. A case study concerns financial forecasting and portfolio optimization with a 400-dimensional series of daily stock prices. Analysis shows that the SGDLM forecasts volatilities and co-volatilities well, making it ideally suited to contributing to quantitative investment strategies to improve portfolio returns. We also identify performance metrics linked to the sequential Bayesian filtering analysis that turn out to define a leading indicator of increased financial market stresses, comparable to but leading the standard St. Louis Fed Financial Stress Index (STLFSI) measure. Parallel computation using GPU implementations substantially advances the ability to fit and use these models. Comment: 28 pages, 9 figures, 7 tables

arXiv.org e-Print Archive

topicmodels: An R Package for Fitting Topic Models

Author: Bettina Grün
Kurt Hornik

Topic models allow the probabilistic modeling of term frequency occurrences in documents. The fitted model can be used to estimate the similarity between documents as well as between a set of specified keywords using an additional layer of latent variables which are referred to as topics. The R package topicmodels provides basic infrastructure for fitting topic models based on data structures from the text mining package tm. The package includes interfaces to two algorithms for fitting topic models: the variational expectation-maximization algorithm provided by David M. Blei and co-authors and an algorithm using Gibbs sampling by Xuan-Hieu Phan and co-authors.

Research Papers in Economics

Optimization with Sparsity-Inducing Penalties

Author: Bach Francis
Jenatton Rodolphe
Mairal Julien
Obozinski Guillaume
Publication date: 01/01/2011

Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate non-smooth norms. The goal of this paper is to present from a general perspective optimization tools and techniques dedicated to such sparsity-inducing penalties. We cover proximal methods, block-coordinate descent, reweighted $\ell_2$-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provide an extensive set of experiments to compare various algorithms from a computational point of view.
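As a concrete instance of the proximal methods this abstract mentions, here is a minimal sketch of proximal gradient descent (ISTA) for an ℓ1-penalized least-squares problem. It is a textbook illustration rather than code from the paper; the names `soft_threshold` and `ista_lasso`, the fixed step size 1/L, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista_lasso(A, y, lam, n_iter=500):
    """Minimize 0.5 * ||A w - y||^2 + lam * ||w||_1 by proximal gradient."""
    # Step 1/L, where L = ||A||_2^2 is the Lipschitz constant of the
    # gradient of the smooth least-squares term.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - y)                          # gradient step on the smooth part
        w = soft_threshold(w - step * grad, step * lam)   # prox step on the l1 part
    return w

def lasso_objective(A, y, w, lam):
    return 0.5 * np.sum((A @ w - y) ** 2) + lam * np.sum(np.abs(w))

# Synthetic sparse regression problem, for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
w_true = np.zeros(10)
w_true[:2] = [3.0, -2.0]
y = A @ w_true
w_hat = ista_lasso(A, y, lam=1.0)
```

The non-smooth penalty is handled entirely through its proximal operator, which is why the same loop structure carries over to other penalties whose proximal operator is cheap to evaluate.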

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Rennes 1

Monitoring multicountry macroeconomic risk

Author: Korobilis Dimitris
Schröder Maximilian
Publication venue: Norges Bank
Publication date: 01/01/2023

We propose a multicountry quantile factor augmented vector autoregression (QFAVAR) to model heterogeneities both across countries and across characteristics of the distributions of macroeconomic time series. The presence of quantile factors allows for summarizing these two heterogeneities in a parsimonious way. We develop two algorithms for posterior inference that feature varying levels of trade-off between estimation precision and computational speed. Using monthly data for the euro area, we establish the good empirical properties of the QFAVAR as a tool for assessing the effects of global shocks on country-level macroeconomic risks. In particular, QFAVAR short-run tail forecasts are more accurate compared to a FAVAR with symmetric Gaussian errors, as well as univariate quantile autoregressions that ignore comovements among quantiles of macroeconomic variables. We also illustrate how quantile impulse response functions and quantile connectedness measures, resulting from the new model, can be used to implement joint risk scenario analysis.

Norges Banks vitenarkiv

Unsupervised Representation Learning with Correlations

Author: Tang Da
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020

Unsupervised representation learning algorithms have been playing important roles in machine learning and related fields. However, due to optimization intractability or a lack of consideration of given data correlation structures, some unsupervised representation learning algorithms still cannot reliably discover the inherent features of the data under certain circumstances. This thesis extends these algorithms and addresses the above issues by taking data correlations into consideration. We study three different ways of improving unsupervised representation learning algorithms by utilizing correlation information, via the following three tasks respectively: 1. Using estimated correlations between data points to provide smart optimization initializations, for multi-way matching (Chapter 2). In this work, we define a correlation score between pairs of data points as a metric for correlation, and initialize all the permutation matrices along a maximum spanning tree of the undirected graph with these scores as the weights. 2. Faster optimization by utilizing the correlations in the observations, for variational inference (Chapter 3). We construct a positive definite matrix from the negative Hessian of the log-likelihood part of the objective that can capture the influence of the observation correlations on the parameter vector. We then use the inverse of this matrix to rescale the gradient. 3. Utilizing additional side-information on data correlation structures to explicitly learn correlations between data points, for extensions of Variational Auto-Encoders (VAEs) (Chapters 4 and 5). Consider the case where we know a correlation graph G of the data points. Instead of placing an i.i.d. prior as in the most common setting, we adopt correlated priors and/or correlated variational distributions on the latent variables by utilizing the graph G.
Empirical results on these tasks show the success of the proposed methods in improving the performance of unsupervised representation learning algorithms. We compare our methods with multiple recent advanced algorithms on various tasks, on both synthetic and real datasets. We also provide theoretical analysis for some of the proposed methods, showing their advantages in certain situations. The proposed methods have a wide range of applications, for example image compression (via smart initializations for multi-way matching) and link prediction (by VAEs with correlations).

Columbia University Academic Commons