Penalized Orthogonal-Components Regression for Large p Small n Data
We propose a penalized orthogonal-components regression (POCRE) for large p
small n data. Orthogonal components are sequentially constructed to maximize,
upon standardization, their correlation to the response residuals. A new
penalization framework, implemented via empirical Bayes thresholding, is
presented to effectively identify sparse predictors of each component. POCRE is
computationally efficient owing to its sequential construction of leading
sparse principal components. In addition, such construction offers other
properties such as grouping highly correlated predictors and allowing for
collinear or nearly collinear predictors. With multivariate responses, POCRE
can construct common components and thus build up latent-variable models for
large p small n data. Comment: 12 pages
Sparse Vector Autoregressive Modeling
The vector autoregressive (VAR) model has been widely used for modeling
temporal dependence in a multivariate time series. For large (and even
moderate) dimensions, the number of AR coefficients can be prohibitively large,
resulting in noisy estimates, unstable predictions and difficult-to-interpret
temporal dependence. To overcome such drawbacks, we propose a 2-stage approach
for fitting sparse VAR (sVAR) models in which many of the AR coefficients are
zero. The first stage selects non-zero AR coefficients based on an estimate of
the partial spectral coherence (PSC) together with the use of BIC. The PSC is
useful for quantifying the conditional relationship between marginal series in
a multivariate process. A refinement second stage is then applied to further
reduce the number of parameters. The performance of this 2-stage approach is
illustrated with simulation results. The 2-stage approach is also applied to
two real data examples: the first is the Google Flu Trends data and the second
is a time series of concentration levels of air pollutants. Comment: 39 pages, 7 figures
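
To make the "prohibitively large" claim in the abstract above concrete: a VAR model of order p for a k-dimensional series has p coefficient matrices of size k x k, so k*k*p AR coefficients before any sparsity is imposed. The short sketch below only does this arithmetic; the function name `var_coefficient_count` and the example dimensions are illustrative assumptions, not taken from the paper.

```python
def var_coefficient_count(k, p):
    """Number of AR coefficients in an unrestricted VAR(p) for k series.

    Each of the p lags contributes one k-by-k coefficient matrix.
    Intercepts and the noise covariance are not counted here.
    """
    if k < 1 or p < 0:
        raise ValueError("k must be >= 1 and p must be >= 0")
    return k * k * p


if __name__ == "__main__":
    # Hypothetical dimensions, chosen only to show how quickly the count grows.
    for k in (5, 20, 50):
        print(k, var_coefficient_count(k, p=2))
```

The quadratic growth in k is what motivates setting many of these coefficients to zero, as the sVAR approach does.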
A General Framework for Fast Stagewise Algorithms
Forward stagewise regression follows a very simple strategy for constructing
a sequence of sparse regression estimates: it starts with all coefficients
equal to zero, and iteratively updates the coefficient (by a small amount
$\epsilon$) of the variable that achieves the maximal absolute inner product
with the current residual. This procedure has an interesting connection to the
lasso: under some conditions, it is known that the sequence of forward
stagewise estimates exactly coincides with the lasso path, as the step size
$\epsilon$ goes to zero. Furthermore, essentially the same equivalence holds
outside of least squares regression, with the minimization of a differentiable
convex loss function subject to an $\ell_1$ norm constraint (the stagewise
algorithm now updates the coefficient corresponding to the maximal absolute
component of the gradient).
Even when they do not match their $\ell_1$-constrained analogues, stagewise
estimates provide a useful approximation, and are computationally appealing.
Their success in sparse modeling motivates the question: can a simple,
effective strategy like forward stagewise be applied more broadly in other
regularization settings, beyond the $\ell_1$ norm and sparsity? The current
paper is an attempt to do just this. We present a general framework for
stagewise estimation, which yields fast algorithms for problems such as
group-structured learning, matrix completion, image denoising, and more. Comment: 56 pages, 15 figures
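
The basic least-squares forward stagewise procedure described at the start of the abstract above can be sketched in a few lines. This is a minimal illustration in plain Python, not the paper's general framework: the function name `forward_stagewise`, the default step size, and the iteration cap are assumptions made for the example.

```python
def forward_stagewise(X, y, step=0.01, n_iter=1000):
    """Forward stagewise least-squares regression (illustrative sketch).

    X: list of n rows, each a list of p predictor values.
    y: list of n responses.
    Returns the coefficient list after at most n_iter small updates.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p          # start with all coefficients equal to zero
    residual = list(y)        # so the residual starts as the response itself
    for _ in range(n_iter):
        # Inner product of each predictor column with the current residual.
        inner = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(p)]
        # Pick the variable with the maximal absolute inner product.
        j_best = max(range(p), key=lambda j: abs(inner[j]))
        if inner[j_best] == 0.0:
            break             # residual is orthogonal to every predictor
        # Move that coefficient by a small amount in the direction of the sign.
        delta = step if inner[j_best] > 0 else -step
        beta[j_best] += delta
        for i in range(n):
            residual[i] -= delta * X[i][j_best]
    return beta


if __name__ == "__main__":
    # Toy data (made up for illustration): y depends only on the first column.
    X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
    y = [2.0, 0.0, -2.0, 0.0]
    print(forward_stagewise(X, y))
```

Recording `beta` after each update, rather than only at the end, would give the sequence of sparse estimates that the abstract compares with the lasso path as the step size shrinks.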
Sparse canonical correlation analysis from a predictive point of view
Canonical correlation analysis (CCA) describes the associations between two
sets of variables by maximizing the correlation between linear combinations of
the variables in each data set. However, in high-dimensional settings where the
number of variables exceeds the sample size or when the variables are highly
correlated, traditional CCA is no longer appropriate. This paper proposes a
method for sparse CCA. Sparse estimation produces linear combinations of only a
subset of variables from each data set, thereby increasing the interpretability
of the canonical variates. We consider the CCA problem from a predictive point
of view and recast it into a regression framework. By combining an alternating
regression approach together with a lasso penalty, we induce sparsity in the
canonical vectors. We compare the performance with other sparse CCA techniques
in different simulation settings and illustrate its usefulness on a genomic
data set.