3,050 research outputs found
Flexible shrinkage in high-dimensional Bayesian spatial autoregressive models
This article introduces two absolutely continuous global-local shrinkage
priors to enable stochastic variable selection in the context of
high-dimensional matrix exponential spatial specifications. Existing approaches
to dealing with overparameterization problems in spatial
autoregressive specifications typically rely on computationally demanding
Bayesian model-averaging techniques. The proposed shrinkage priors can be
implemented using Markov chain Monte Carlo methods in a flexible and efficient
way. A simulation study is conducted to evaluate the performance of each of the
shrinkage priors. Results suggest that they perform particularly well in
high-dimensional environments, especially when the number of parameters to
estimate exceeds the number of observations. For an empirical illustration we
use pan-European regional economic growth data.
Comment: Keywords: Matrix exponential spatial specification, model selection,
shrinkage priors, hierarchical modeling; JEL: C11, C21, C5
Estimation of the spatial weighting matrix for regular lattice data—An adaptive lasso approach with cross-sectional resampling
Spatial autoregressive models typically rely on the assumption that the spatial dependence structure is known in advance and is represented by a deterministic spatial weights matrix, although it is unknown in most empirical applications. Thus, we investigate the estimation of sparse spatial dependence structures for regular lattice data. In particular, an adaptive least absolute shrinkage and selection operator (lasso) is used to select and estimate the individual nonzero connections of the spatial weights matrix. To recover the spatial dependence structure, we propose cross-sectional resampling, assuming that the random process is exchangeable. The estimation procedure is based on a two-step approach to circumvent simultaneity issues that typically arise from endogenous spatial autoregressive dependencies. The two-step adaptive lasso approach with cross-sectional resampling is verified using Monte Carlo simulations. Finally, we apply the procedure to model nitrogen dioxide (NO₂) concentrations and show that estimating the spatial dependence structure rather than using prespecified weights matrices improves the prediction accuracy considerably. © 2021 The Authors. Environmetrics published by John Wiley & Sons, Ltd.
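To make the adaptive lasso idea in the abstract above concrete, here is a minimal sketch of the adaptive weighting step only. It is not the authors' two-step procedure with cross-sectional resampling. It assumes an orthonormal design, in which case a weighted lasso has a closed-form solution by soft-thresholding an initial estimate. The function names, the penalty level `lam`, and the exponent `gamma` are hypothetical choices for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values in [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def adaptive_lasso_orthonormal(beta_init, lam, gamma=1.0):
    """Adaptive lasso under an ASSUMED orthonormal design (X'X = I).

    Minimizes 0.5 * ||y - X b||^2 + lam * sum_j w_j * |b_j|
    with data-driven weights w_j = 1 / |beta_init_j|**gamma, where
    beta_init is a first-step estimate (e.g. least squares).
    """
    estimates = []
    for b in beta_init:
        if b == 0.0:
            # Infinite weight: the coefficient stays at zero.
            estimates.append(0.0)
            continue
        weight = 1.0 / abs(b) ** gamma
        # Small initial estimates get a large threshold and are zeroed out;
        # large initial estimates are shrunk only slightly.
        estimates.append(soft_threshold(b, lam * weight))
    return estimates


# Hypothetical first-step estimates of four candidate connections.
result = adaptive_lasso_orthonormal([3.0, 0.1, -2.0, 0.05], lam=0.2)
print(result)
```

In this toy input the two small initial estimates are set exactly to zero, while the two large ones are shrunk only a little, which is the selection behavior the adaptive weights are meant to produce.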
Large Vector Auto Regressions
One popular approach for nonstructural economic and financial forecasting is
to include a large number of economic and financial variables, which has been
shown to lead to significant improvements for forecasting, for example, by the
dynamic factor models. A challenging issue is to determine which variables and
(their) lags are relevant, especially when there is a mixture of serial
correlation (temporal dynamics), high dimensional (spatial) dependence
structure and moderate sample size (relative to dimensionality and lags). To
this end, an integrated solution that addresses these three challenges
simultaneously is appealing. We study the large vector auto regressions here
with three types of estimates. We treat each variable's own lags differently from
other variables' lags, distinguish various lags over time, and are able to
select the variables and lags simultaneously. We first show the consequences of
using Lasso-type estimates directly for time series without considering the
temporal dependence. In contrast, our proposed method can still produce an
estimate as efficient as an oracle under such scenarios. The tuning
parameters are chosen via a data driven "rolling scheme" method to optimize the
forecasting performance. A macroeconomic and financial forecasting problem is
considered to illustrate its superiority over existing estimators.
Large Vector Auto Regressions
One popular approach for nonstructural economic and financial forecasting is to include a large number of economic and financial variables, which has been shown to lead to significant improvements for forecasting, for example, by the dynamic factor models. A challenging issue is to determine which variables and (their) lags are relevant, especially when there is a mixture of serial correlation (temporal dynamics), high dimensional (spatial) dependence structure and moderate sample size (relative to dimensionality and lags). To this end, an integrated solution that addresses these three challenges simultaneously is appealing. We study the large vector auto regressions here with three types of estimates. We treat each variable's own lags differently from other variables' lags, distinguish various lags over time, and are able to select the variables and lags simultaneously. We first show the consequences of using Lasso-type estimates directly for time series without considering the temporal dependence. In contrast, our proposed method can still produce an estimate as efficient as an oracle under such scenarios. The tuning parameters are chosen via a data driven "rolling scheme" method to optimize the forecasting performance. A macroeconomic and financial forecasting problem is considered to illustrate its superiority over existing estimators.
Keywords: Time Series, Vector Auto Regression, Regularization, Lasso, Group Lasso, Oracle estimator
Estimation of the spatial weighting matrix for regular lattice data -- An adaptive lasso approach with cross-sectional resampling
Spatial econometric research typically relies on the assumption that the
spatial dependence structure is known in advance and is represented by a
deterministic spatial weights matrix. Contrary to classical approaches, we
investigate the estimation of sparse spatial dependence structures for regular
lattice data. In particular, an adaptive least absolute shrinkage and selection
operator (lasso) is used to select and estimate the individual connections of
the spatial weights matrix. To recover the spatial dependence structure, we
propose cross-sectional resampling, assuming that the random process is
exchangeable. The estimation procedure is based on a two-step approach to
circumvent simultaneity issues that typically arise from endogenous spatial
autoregressive dependencies. The two-step adaptive lasso approach with
cross-sectional resampling is verified using Monte Carlo simulations.
Finally, we apply the procedure to model nitrogen dioxide (NO₂)
concentrations and show that estimating the spatial dependence structure
rather than using prespecified weights matrices improves the prediction
accuracy considerably.
Gene ranking and biomarker discovery under correlation
Biomarker discovery and gene ranking are standard tasks in genomic
high-throughput analysis. Typically, the ordering of markers is based on a
stabilized variant of the t-score, such as the moderated t or the SAM
statistic. However, these procedures ignore gene-gene correlations, which may
have a profound impact on the gene orderings and on the power of the subsequent
tests.
We propose a simple procedure that adjusts gene-wise t-statistics to take
account of correlations among genes. The resulting correlation-adjusted
t-scores ("cat" scores) are derived from a predictive perspective, i.e. as a
score for variable selection to discriminate group membership in two-class
linear discriminant analysis. In the absence of correlation the cat score
reduces to the standard t-score. Moreover, using the cat score it is
straightforward to evaluate groups of features (i.e. gene sets). For
computation of the cat score from small sample data we propose a shrinkage
procedure. In a comparative study comprising six different synthetic and
empirical correlation structures we show that the cat score improves estimation
of gene orderings and leads to higher power for fixed true discovery rate, and
vice versa. Finally, we also illustrate the cat score by analyzing metabolomic
data.
The shrinkage cat score is implemented in the R package "st" available from
URL http://cran.r-project.org/web/packages/st/
Comment: 18 pages, 5 figures, 1 table
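As a small illustration of the decorrelation idea described in the abstract above, the sketch below assumes the cat score takes the form P^(-1/2) t, where t is the vector of gene-wise t-scores and P is the gene-gene correlation matrix. That formula is an assumption here, since the abstract does not state it, and the shrinkage estimation of P is omitted. The code is not the "st" package; it handles only two genes with a known correlation r so that P^(-1/2) has a closed form, and the function names are hypothetical.

```python
import math


def inv_sqrt_corr_2x2(r):
    """Return P^(-1/2) for P = [[1, r], [r, 1]] with |r| < 1.

    P has eigenvalues 1 + r and 1 - r with eigenvectors (1, 1)/sqrt(2)
    and (1, -1)/sqrt(2), which gives the closed form used below.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    lo = 1.0 / math.sqrt(1.0 + r)
    hi = 1.0 / math.sqrt(1.0 - r)
    a = (lo + hi) / 2.0
    b = (lo - hi) / 2.0
    return [[a, b], [b, a]]


def cat_scores_two_genes(t, r):
    """Correlation-adjusted scores for two genes (ASSUMED form P^(-1/2) t)."""
    m = inv_sqrt_corr_2x2(r)
    return [
        m[0][0] * t[0] + m[0][1] * t[1],
        m[1][0] * t[0] + m[1][1] * t[1],
    ]


# With no correlation the adjusted scores equal the plain t-scores.
print(cat_scores_two_genes([2.0, 1.0], r=0.0))
# With positive correlation the scores are adjusted for the shared signal.
print(cat_scores_two_genes([2.0, 1.0], r=0.5))
```

Under this assumed form, the squared adjusted scores sum to t' P^(-1) t, so the scores split a joint statistic into per-gene contributions, and setting r = 0 recovers the standard t-scores as the abstract states.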