
    Feature Extraction in Signal Regression: A Boosting Technique for Functional Data Regression

    The main objectives of feature extraction in signal regression are to improve prediction accuracy on future data and to identify the relevant parts of the signal. A feature extraction procedure is proposed that uses boosting techniques to select the relevant parts of the signal. The proposed blockwise boosting procedure simultaneously selects intervals in the signal's domain and estimates their effect on the response. The definition of the blocks explicitly uses the underlying metric of the signal. Simulation studies and real-world data demonstrate that the proposed approach competes well with procedures such as PLS, P-spline signal regression, and functional data regression. The paper is a preprint of an article published in the Journal of Computational and Graphical Statistics; please use the journal version for citation.
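    To make the blockwise idea concrete, the following is a minimal, generic sketch of blockwise L2-boosting for a linear signal regression model: the columns of the design matrix (the digitised signal) are split into contiguous blocks, and at each step the block whose least-squares fit best reduces the current residual is selected and added with a small learning rate. The block size, learning rate, number of steps, and function name are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def blockwise_l2_boost(X, y, block_size=10, steps=200, nu=0.1):
    """Generic blockwise L2-boosting sketch for signal regression.

    The columns of X (the sampled signal) are split into contiguous blocks.
    At each boosting step, the block whose least-squares fit reduces the
    residual sum of squares the most is selected, and a shrunken update
    (learning rate nu) is added to the coefficient vector.  This only
    illustrates the idea of jointly selecting intervals and estimating
    their effect; the paper's block definition and stopping rule differ.
    """
    n, p = X.shape
    blocks = [np.arange(s, min(s + block_size, p)) for s in range(0, p, block_size)]
    beta = np.zeros(p)
    resid = y.astype(float).copy()
    for _ in range(steps):
        best_rss, best_idx, best_coef = np.inf, None, None
        for idx in blocks:
            Xb = X[:, idx]
            coef, *_ = np.linalg.lstsq(Xb, resid, rcond=None)
            rss = np.sum((resid - Xb @ coef) ** 2)
            if rss < best_rss:
                best_rss, best_idx, best_coef = rss, idx, coef
        beta[best_idx] += nu * best_coef
        resid -= nu * (X[:, best_idx] @ best_coef)
    return beta
```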

    The Smooth-Lasso and other $\ell_1+\ell_2$-penalized methods

    We consider a linear regression problem in a high-dimensional setting where the number of covariates $p$ can be much larger than the sample size $n$. In such a situation, one often assumes sparsity of the regression vector, i.e., that the regression vector contains many zero components. We propose a Lasso-type estimator $\hat{\beta}^{Quad}$ (where 'Quad' stands for quadratic) which is based on two penalty terms. The first one is the $\ell_1$ norm of the regression coefficients, used to exploit the sparsity of the regression as done by the Lasso estimator, whereas the second is a quadratic penalty term introduced to capture some additional information on the setting of the problem. We detail two special cases: the Elastic-Net $\hat{\beta}^{EN}$, which deals with sparse problems where correlations between variables may exist; and the Smooth-Lasso $\hat{\beta}^{SL}$, which responds to sparse problems where successive regression coefficients are known to vary slowly (in some situations, this can also be interpreted in terms of correlations between successive variables). From a theoretical point of view, we establish variable selection consistency results and show that $\hat{\beta}^{Quad}$ achieves a Sparsity Inequality, i.e., a bound in terms of the number of non-zero components of the 'true' regression vector. These results are provided under a weaker assumption on the Gram matrix than the one used by the Lasso. In some situations this guarantees a significant improvement over the Lasso. Furthermore, a simulation study is conducted and shows that the S-Lasso $\hat{\beta}^{SL}$ performs better than known methods such as the Lasso, the Elastic-Net $\hat{\beta}^{EN}$, and the Fused-Lasso with respect to estimation accuracy. This is especially the case when the regression vector is 'smooth', i.e., when the variations between successive coefficients of the unknown regression parameter are small. The study also reveals that the theoretical calibration of the tuning parameters and the calibration based on 10-fold cross-validation yield two S-Lasso solutions with very close performance.
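    To illustrate the $\ell_1+\ell_2$ structure, here is a minimal sketch of the Smooth-Lasso objective $\|y - X\beta\|_2^2 + \lambda_1\|\beta\|_1 + \lambda_2\sum_{j}(\beta_{j+1}-\beta_j)^2$, solved by the standard trick of absorbing the quadratic difference penalty into extra rows of an augmented design matrix and then running an ordinary Lasso. The function name, the augmentation approach, and the use of scikit-learn are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def smooth_lasso(X, y, lam1, lam2):
    """Illustrative Smooth-Lasso sketch (not the paper's code).

    Minimises  ||y - X b||^2 + lam1 * ||b||_1 + lam2 * sum_j (b_{j+1} - b_j)^2
    by stacking sqrt(lam2) times the first-difference matrix under X and
    zeros under y, so the quadratic penalty becomes part of the squared
    loss, then fitting an ordinary Lasso on the augmented data.
    """
    n, p = X.shape
    # First-difference matrix D of shape (p-1, p): (D b)_j = b_{j+1} - b_j.
    D = np.diff(np.eye(p), axis=0)
    X_aug = np.vstack([X, np.sqrt(lam2) * D])
    y_aug = np.concatenate([y, np.zeros(p - 1)])
    # scikit-learn's Lasso minimises (1/(2*n_samples)) * ||y - X b||^2 + alpha * ||b||_1,
    # so rescale lam1 to match the unnormalised objective above.
    alpha = lam1 / (2 * X_aug.shape[0])
    model = Lasso(alpha=alpha, fit_intercept=False, max_iter=50_000)
    model.fit(X_aug, y_aug)
    return model.coef_
```

    Setting lam2 = 0 in this sketch recovers the plain Lasso, while replacing the difference matrix with the identity gives the (naive) Elastic-Net, matching the two special cases discussed in the abstract.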