A Novel Approach to Forecasting High Dimensional S&P500 Portfolio Using VARX Model with Information Complexity
This study considers vector autoregressive models with endogenous and exogenous regressors (VARX), estimated by multivariate OLS regression. For model selection, we follow Bozdogan's entropic (information-theoretic) measure of complexity, the ICOMP criterion of the estimated inverse Fisher information matrix (IFIM), in choosing the best VARX lag order, and we establish that ICOMP outperforms the conventional information criteria. As an empirical illustration, we reduced the dimension of the S&P500 multivariate time series using Sparse Principal Component Analysis (SPCA) and chose the best subset of 37 stocks belonging to six sectors. We then constructed a portfolio of stocks based on the highest SPC loading weight matrix, plus the S&P500 index. Furthermore, we applied the proposed VARX model to predict the price movements in the constructed portfolio, where the S&P500 index was treated as an exogenous regressor. We also found that buy-sell decision making based on the VARX(4,0) forecasts for a stock outperforms buying and holding the stock over the out-of-sample period.
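A minimal sketch of the lag-selection step is given below, assuming statsmodels for the VARX fit. The simulated arrays, the candidate lag range, and the use of the residual covariance as a proxy for the IFIM inside Bozdogan's C1 complexity are illustrative assumptions, not the study's exact implementation.

```python
# Hedged sketch: VARX lag selection with an ICOMP-style score.
# The residual covariance stands in for the inverse Fisher information
# matrix (an assumption for illustration only).
import numpy as np
from statsmodels.tsa.api import VAR

def c1_complexity(sigma):
    """Bozdogan's C1 entropic complexity of a covariance matrix."""
    s = sigma.shape[0]
    return 0.5 * s * np.log(np.trace(sigma) / s) \
        - 0.5 * np.linalg.slogdet(sigma)[1]

def icomp(res):
    """ICOMP-style score: -2 log-likelihood + 2 * C1(residual covariance)."""
    return -2.0 * res.llf + 2.0 * c1_complexity(res.sigma_u)

rng = np.random.default_rng(0)
endog = rng.standard_normal((500, 3))  # stand-in for portfolio returns
exog = rng.standard_normal((500, 1))   # stand-in for the S&P500 index

# Fit VARX(p) for each candidate lag; ic=None fixes the lag order at p.
scores = {p: icomp(VAR(endog, exog=exog).fit(p, ic=None)) for p in range(1, 6)}
best_lag = min(scores, key=scores.get)
```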
Covariance Estimation: The GLM and Regularization Perspectives
Finding an unconstrained and statistically interpretable reparameterization
of a covariance matrix is still an open problem in statistics. Its solution is
of central importance in covariance estimation, particularly in the recent
high-dimensional data environment where enforcing the positive-definiteness
constraint could be computationally expensive. We provide a survey of the
progress made in modeling covariance matrices from two relatively complementary
perspectives: (1) generalized linear models (GLM) or parsimony and use of
covariates in low dimensions, and (2) regularization or sparsity for
high-dimensional data. An emerging, unifying and powerful trend in both
perspectives is that of reducing a covariance estimation problem to that of
estimating a sequence of regression models. We point out several instances of
the regression-based formulation. A notable case is in sparse estimation of a
precision matrix or a Gaussian graphical model leading to the fast graphical
LASSO algorithm. Some advantages and limitations of the regression-based
Cholesky decomposition relative to the classical spectral (eigenvalue) and
variance-correlation decompositions are highlighted. The former provides an
unconstrained and statistically interpretable reparameterization, and
guarantees the positive-definiteness of the estimated covariance matrix. It
reduces the unintuitive task of covariance estimation to that of modeling a
sequence of regressions at the cost of imposing an a priori order among the
variables. Elementwise regularization of the sample covariance matrix such as
banding, tapering and thresholding has desirable asymptotic properties and the
sparse estimated covariance matrix is positive definite with probability
tending to one for large samples and dimensions.
Comment: Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/11-STS358.
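To make the regression-based Cholesky idea concrete, the sketch below estimates a covariance matrix by regressing each variable on its predecessors in a fixed order. Plain OLS stands in for the regularized regressions (e.g., the lasso) discussed in the survey, and the variable ordering is assumed given; both are illustrative assumptions.

```python
# Hedged sketch: regression-based (modified Cholesky) covariance estimation.
import numpy as np

def cholesky_regression_cov(X):
    """Covariance estimate Sigma = T^{-1} D T^{-T} from sequential
    regressions; positive definite by construction whenever the
    prediction-error variances on the diagonal of D are positive."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    T = np.eye(p)                      # unit lower-triangular
    d = np.empty(p)                    # prediction-error variances
    d[0] = Xc[:, 0] @ Xc[:, 0] / n
    for j in range(1, p):
        phi = np.linalg.lstsq(Xc[:, :j], Xc[:, j], rcond=None)[0]
        T[j, :j] = -phi                # negated regression coefficients
        resid = Xc[:, j] - Xc[:, :j] @ phi
        d[j] = resid @ resid / n
    Tinv = np.linalg.inv(T)
    return Tinv @ np.diag(d) @ Tinv.T

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
Sigma_hat = cholesky_regression_cov(X)
assert np.all(np.linalg.eigvalsh(Sigma_hat) > 0)  # positive definite
```

The unconstrained reparameterization is visible here: the entries of T below the diagonal and the log-variances in D can take any real values, yet the reconstructed matrix is always positive definite.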
Synergizing Roughness Penalization and Basis Selection in Bayesian Spline Regression
Bayesian P-splines and basis determination through Bayesian model selection
are both commonly employed strategies for nonparametric regression using spline
basis expansions within the Bayesian framework. Despite their widespread use,
each method has particular limitations that may introduce potential estimation
bias depending on the nature of the target function. To overcome the
limitations associated with each method while capitalizing on their respective
strengths, we propose a new prior distribution that integrates the essentials
of both approaches. The proposed prior distribution assesses the complexity of
the spline model based on a penalty term formed by a convex combination of the
penalties from both methods. The proposed method exhibits adaptability to the
unknown level of smoothness, while achieving the minimax-optimal posterior
contraction rate up to a logarithmic factor. We provide an efficient Markov
chain Monte Carlo algorithm for implementing the proposed approach. Our
extensive simulation study reveals that the proposed method outperforms its
competitors in terms of performance metrics or model complexity.
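The penalized B-spline machinery shared by both approaches is sketched below. The knot grid, penalty weight, and simulated data are illustrative; the paper's prior, which mixes this roughness penalty with a basis-selection penalty via a convex combination, is not reproduced here.

```python
# Hedged sketch: P-spline fit via least squares on a B-spline basis
# with a second-order difference penalty on the coefficients.
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 150))
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(x.size)

# Cubic B-spline basis on an equally spaced knot grid.
k, n_inner = 3, 20
knots = np.r_[[0.0] * (k + 1), np.linspace(0, 1, n_inner)[1:-1], [1.0] * (k + 1)]
n_basis = len(knots) - k - 1
B = BSpline.design_matrix(x, knots, k).toarray()

# Second-order difference penalty D2' D2 (the roughness penalty).
D2 = np.diff(np.eye(n_basis), n=2, axis=0)
lam = 1.0  # illustrative penalty weight
beta = np.linalg.solve(B.T @ B + lam * D2.T @ D2, B.T @ y)
fit = B @ beta
```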
Mixed Methods for Mixed Models
This work bridges the frequentist and Bayesian approaches to mixed models by borrowing the best features of both camps: point estimation procedures are combined with priors to obtain accurate, fast inference, while posterior simulation techniques are developed that approximate the likelihood with great precision for the purpose of assessing uncertainty. Together these allow flexible inference without relying on expensive Markov chain Monte Carlo simulation. Default priors are developed and evaluated in a variety of simulation and real-world settings, and we propose a new set of standard approaches that yield superior performance at little computational cost.
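A minimal sketch of the general strategy follows: REML point estimation of a linear mixed model, then a cheap Gaussian approximation to the fixed-effects posterior in place of MCMC. The simulated data, the flat-prior normal approximation, and the statsmodels fitter are illustrative assumptions, not the paper's default priors or its likelihood approximation.

```python
# Hedged sketch: point estimation plus a Gaussian posterior approximation.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
groups = np.repeat(np.arange(30), 10)
u = rng.normal(0.0, 0.8, 30)[groups]            # random intercepts
x = rng.standard_normal(groups.size)
y = 1.0 + 0.5 * x + u + rng.standard_normal(groups.size)
data = pd.DataFrame({"y": y, "x": x, "g": groups})

# Point estimation: REML fit of a random-intercept model.
fit = smf.mixedlm("y ~ x", data, groups=data["g"]).fit(reml=True)

# Posterior approximation: normal centered at the REML estimates with
# the estimated sampling covariance (fixed effects occupy the leading
# block of the packed parameter vector).
k = len(fit.fe_params)
fe_mean = fit.fe_params.to_numpy()
fe_cov = np.asarray(fit.cov_params())[:k, :k]
draws = rng.multivariate_normal(fe_mean, fe_cov, size=2000)
```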