Don't Fall for Tuning Parameters: Tuning-Free Variable Selection in High Dimensions With the TREX
Lasso is a seminal contribution to high-dimensional statistics, but it hinges
on a tuning parameter that is difficult to calibrate in practice. A partial
remedy for this problem is Square-Root Lasso, because it inherently calibrates
to the noise variance. However, Square-Root Lasso still requires a tuning
parameter that must be calibrated to all other aspects of the model. In this
study, we introduce TREX, an alternative to Lasso with an inherent calibration
to all aspects of the model. This adaptation to the entire model renders TREX
an estimator that does not require any calibration of tuning parameters. We
show that TREX can outperform cross-validated Lasso in terms of variable
selection and computational efficiency. We also introduce a bootstrapped
version of TREX that can further improve variable selection. We illustrate the
promising performance of TREX both on synthetic data and on a recent
high-dimensional biological data set concerning riboflavin production in B. subtilis.
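A minimal sketch of the estimator described above, assuming the TREX objective from the paper (squared loss divided by a constant c times the maximal residual-predictor correlation, plus an l1 term, with c = 0.5) and a generic derivative-free local solver; the original work uses dedicated algorithms, so this is illustrative only:

import numpy as np
from scipy.optimize import minimize

def trex_objective(beta, X, y, c=0.5, eps=1e-12):
    # TREX objective: the squared loss is divided by c times the
    # maximal correlation between residual and predictors, so the
    # scale of the problem is calibrated internally.
    r = y - X @ beta
    denom = c * np.max(np.abs(X.T @ r)) + eps
    return (r @ r) / denom + np.abs(beta).sum()

def trex(X, y, c=0.5):
    # Derivative-free local solver as a stand-in for the tailored
    # algorithms in the literature; the objective is nonconvex, so
    # this only finds a local minimum.
    p = X.shape[1]
    return minimize(trex_objective, np.zeros(p), args=(X, y, c),
                    method="Powell").x

Note that trex takes no free tuning parameter beyond the fixed constant c, which is exactly the point made in the abstract.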
Compute Less to Get More: Using ORC to Improve Sparse Filtering
Sparse Filtering is a popular feature learning algorithm for image
classification pipelines. In this paper, we connect the performance of Sparse
Filtering with spectral properties of the corresponding feature matrices. This
connection provides new insights into Sparse Filtering; in particular, it
suggests early stopping of Sparse Filtering. We therefore introduce the Optimal
Roundness Criterion (ORC), a novel stopping criterion for Sparse Filtering. We
show that this stopping criterion is related to pre-processing procedures
such as Statistical Whitening and demonstrate that it can make image
classification with Sparse Filtering considerably faster and more accurate.
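The abstract does not give the ORC formula, so the sketch below pairs the standard sparse-filtering objective of Ngiam et al. (2011) with a hypothetical spectral proxy for roundness (smallest over largest singular value of the feature matrix); it illustrates only the early-stopping idea, not the paper's exact criterion:

import numpy as np
from scipy.optimize import minimize

def roundness(F):
    # Hypothetical proxy for the paper's roundness measure: the ratio
    # of the smallest to the largest singular value of the feature
    # matrix (closer to 1 means a "rounder" spectrum).
    s = np.linalg.svd(F, compute_uv=False)
    return s[-1] / s[0]

def sf_objective(w, X, n_features, eps=1e-8):
    # Standard sparse-filtering objective: l2-normalize each feature
    # across examples, then each example across features, and
    # minimize the sum of absolute values.
    F = np.abs(w.reshape(n_features, -1) @ X)
    F = F / (np.linalg.norm(F, axis=1, keepdims=True) + eps)
    F = F / (np.linalg.norm(F, axis=0, keepdims=True) + eps)
    return F.sum()

def train_with_orc(X, n_features, rounds=50, seed=0):
    # Optimize in short bursts and stop as soon as the roundness
    # proxy stops improving, i.e. the early stopping suggested above.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n_features * X.shape[0])
    best_w, best_r = w, -np.inf
    for _ in range(rounds):
        w = minimize(sf_objective, w, args=(X, n_features),
                     method="L-BFGS-B", options={"maxiter": 5}).x
        r = roundness(np.abs(w.reshape(n_features, -1) @ X))
        if r <= best_r:
            break
        best_w, best_r = w, r
    return best_w.reshape(n_features, -1)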
How Correlations Influence Lasso Prediction
We study how correlations in the design matrix influence Lasso prediction.
First, we argue that the higher the correlations are, the smaller the optimal
tuning parameter is. In particular, this implies that the standard tuning
parameters, which do not depend on the design matrix, are not favorable.
Furthermore, we argue that Lasso prediction works well for any degree of
correlation if suitable tuning parameters are chosen. We study these two
subjects theoretically as well as with simulations.
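A small simulation in the spirit of this claim; the setup (equicorrelated Gaussian design, five active variables, in-sample prediction error measured against the true coefficients) is an assumption for illustration, not the paper's exact experiment:

import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(1)
n, p = 100, 50
beta = np.zeros(p)
beta[:5] = 1.0
for rho in (0.0, 0.5, 0.9):
    # equicorrelated Gaussian design: Cov = (1 - rho) I + rho 11'
    cov = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    y = X @ beta + rng.standard_normal(n)
    alphas, coefs, _ = lasso_path(X, y, n_alphas=100)
    # prediction error ||X(beta_hat - beta)||^2 along the path
    errs = np.sum((X @ (coefs - beta[:, None])) ** 2, axis=0)
    print(f"rho={rho:.1f}: best alpha ~ {alphas[errs.argmin()]:.4f}")

Under this setup, the tuning parameter minimizing the prediction error tends to shrink as the correlation level rho grows, in line with the first claim of the abstract.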
Optimal Two-Step Prediction in Regression
High-dimensional prediction typically comprises two steps: variable selection
and subsequent least-squares refitting on the selected variables. However, the
standard variable selection procedures, such as the lasso, hinge on tuning
parameters that need to be calibrated. Cross-validation, the most popular
calibration scheme, is computationally costly and lacks finite sample
guarantees. In this paper, we introduce an alternative scheme that is easy to
implement and both computationally and theoretically efficient.
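The abstract does not spell out the proposed calibration scheme, so the following sketch shows only the generic two-step pipeline it refers to, with cross-validated lasso as a placeholder for the selection step; the function names are illustrative:

import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def two_step_predict(X, y, X_new):
    # Step 1: variable selection. The paper proposes a cheaper
    # calibration scheme; cross-validated lasso is used here only
    # as a familiar stand-in.
    selector = LassoCV(cv=5).fit(X, y)
    support = np.flatnonzero(selector.coef_)
    if support.size == 0:
        return np.full(X_new.shape[0], y.mean())
    # Step 2: least-squares refit on the selected variables.
    refit = LinearRegression().fit(X[:, support], y)
    return refit.predict(X_new[:, support])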
- …