Penalized Orthogonal-Components Regression for Large p Small n Data
We propose a penalized orthogonal-components regression (POCRE) for large p
small n data. Orthogonal components are sequentially constructed to maximize,
upon standardization, their correlation to the response residuals. A new
penalization framework, implemented via empirical Bayes thresholding, is
presented to effectively identify sparse predictors of each component. POCRE is
computationally efficient owing to its sequential construction of leading
sparse principal components. In addition, such construction offers other
properties such as grouping highly correlated predictors and allowing for
collinear or nearly collinear predictors. With multivariate responses, POCRE
can construct common components and thus build up latent-variable models for
large p small n data.
Comment: 12 pages
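The sequential construction lends itself to a compact sketch. Below is a minimal Python illustration of the idea, in which plain soft-thresholding stands in for the paper's empirical Bayes thresholding, and the deflation and stopping rules are simplified assumptions rather than POCRE's actual specification:

```python
# Minimal sketch of sequential sparse-component construction in the spirit
# of POCRE. Soft-thresholding is an assumed stand-in for empirical Bayes
# thresholding; the deflation and stopping rules are likewise simplified.
import numpy as np

def pocre_sketch(X, y, n_components=5, threshold=0.1):
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize predictors
    r = y - y.mean()                           # initial response residuals
    Xd = X.copy()                              # working (deflated) copy of X
    components, coefs = [], []
    for _ in range(n_components):
        w = Xd.T @ r / len(r)                  # correlation-type weights
        w = np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)  # sparsify
        if not np.any(w):
            break                              # nothing survives thresholding
        t = Xd @ w
        t /= np.linalg.norm(t)                 # standardized component
        b = t @ r                              # regress residuals on component
        r = r - b * t                          # update residuals
        Xd = Xd - np.outer(t, t @ Xd)          # orthogonalize predictors to t
        components.append(w)
        coefs.append(b)
    return np.array(components), np.array(coefs)
```

Each pass builds one standardized component from the sparsified weights, removes its contribution from the residuals, and orthogonalizes the remaining predictors against it, which is what keeps successive components orthogonal.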
Inference for feature selection using the Lasso with high-dimensional data
Penalized regression models such as the Lasso have proved useful for variable
selection in many fields - especially for situations with high-dimensional data
where the number of predictors far exceeds the number of observations. These
methods identify and rank variables of importance but do not generally provide
any inference for the selected variables. Thus, the selected variables might be
the "most important" but need not be significant. We propose a significance
test for the selection found by the Lasso: a procedure that computes p-values
for the features the Lasso chooses. This method
rephrases the null hypothesis and uses a randomization approach which ensures
that the error rate is controlled even for small samples. We demonstrate the
ability of the algorithm to compute p-values of the expected magnitude with
simulated data using a multitude of scenarios that involve various effect
strengths and correlation between predictors. The algorithm is also applied to
a prostate cancer dataset that has been analyzed in recent papers on the
subject. The proposed method is found to provide a powerful way to make
inference for feature selection even for small samples and when the number of
predictors is several orders of magnitude larger than the number of
observations. The algorithm is implemented in the MESS package in R and is
freely available.
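As a rough illustration of the randomization idea, the Python sketch below assesses one Lasso-selected feature by permuting its column and refitting. This is a generic permutation scheme, not the rephrased null hypothesis implemented in the MESS package, and the penalty level `alpha` is an arbitrary placeholder:

```python
# Generic permutation-based significance check for a Lasso-selected feature.
# This shows the flavor of a randomization test; it is not the specific
# procedure implemented in the MESS package.
import numpy as np
from sklearn.linear_model import Lasso

def permutation_pvalue(X, y, feature, alpha=0.1, n_perm=500, seed=0):
    rng = np.random.default_rng(seed)
    observed = abs(Lasso(alpha=alpha).fit(X, y).coef_[feature])
    exceed = 0
    for _ in range(n_perm):
        Xp = X.copy()
        Xp[:, feature] = rng.permutation(Xp[:, feature])  # break association
        exceed += abs(Lasso(alpha=alpha).fit(Xp, y).coef_[feature]) >= observed
    return (1 + exceed) / (1 + n_perm)  # add-one p-value estimate
```

The add-one correction keeps the estimated p-value strictly positive, a standard device for keeping permutation p-values valid in small samples.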
Bayesian inference in high-dimensional linear models using an empirical correlation-adaptive prior
In the context of a high-dimensional linear regression model, we propose the
use of an empirical correlation-adaptive prior that makes use of information in
the observed predictor variable matrix to adaptively address high collinearity,
determining if parameters associated with correlated predictors should be
shrunk together or kept apart. Under suitable conditions, we prove that this
empirical Bayes posterior concentrates around the true sparse parameter at the
optimal rate asymptotically. A simplified version of a shotgun stochastic
search algorithm is employed to implement the variable selection procedure, and
we show, via simulation experiments across different settings and a real-data
application, the favorable performance of the proposed method compared to
existing methods.
Comment: 25 pages, 4 figures, 2 tables
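To give a concrete sense of the search component, here is a simplified shotgun-stochastic-search loop in Python. BIC of an ordinary least-squares fit stands in for the paper's empirical Bayes posterior score, so the correlation-adaptive prior itself is not reproduced:

```python
# Simplified shotgun stochastic search over variable subsets. BIC is an
# assumed stand-in for the paper's empirical Bayes posterior score.
import numpy as np

def bic_score(X, y, subset):
    n = len(y)
    if subset:
        Xs = X[:, sorted(subset)]
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = np.sum((y - Xs @ beta) ** 2)
    else:
        rss = np.sum((y - y.mean()) ** 2)
    return -(n * np.log(rss / n) + len(subset) * np.log(n))  # higher is better

def shotgun_search(X, y, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    model = frozenset()
    best, best_score = model, bic_score(X, y, model)
    for _ in range(n_iter):
        # neighborhood: add one variable or delete one variable
        neighbors = [model | {j} for j in range(p) if j not in model]
        neighbors += [model - {j} for j in model]
        scores = np.array([bic_score(X, y, m) for m in neighbors])
        probs = np.exp(scores - scores.max())   # score-weighted proposal
        probs /= probs.sum()
        model = neighbors[rng.choice(len(neighbors), p=probs)]
        score = bic_score(X, y, model)
        if score > best_score:
            best, best_score = model, score
    return sorted(best)
```

Scoring the whole neighborhood at every step is what makes the search "shotgun": many candidate models are evaluated at once and the next state is drawn in proportion to their scores.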