Variable selection with LASSO regression for complex survey data
Variable selection is an important step in building good prediction models. LASSO regression
models are one of the most commonly used methods for this purpose, for which
cross-validation is the most widely applied validation technique to choose the tuning parameter
(λ). Validation techniques in a complex survey framework are closely related to
“replicate weights”. However, to our knowledge, they have never been used in a LASSO
regression context. Applying LASSO regression models to complex survey data could be
challenging. The goal of this paper is two-fold. On the one hand, we analyze the performance
of replicate weights methods to select the tuning parameter for fitting LASSO regression
models to complex survey data. On the other hand, we propose new replicate weights methods
for the same purpose. In particular, we propose a new design-based cross-validation
method that combines traditional cross-validation with replicate weights. The performance
of all these methods is assessed in an extensive simulation study and compared with that of
the traditional cross-validation technique for selecting the tuning parameter
of LASSO regression models. The results suggest a considerable improvement when
the proposed design-based cross-validation is used instead of traditional cross-validation.
IT1456-22
PIF18/21
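
As a rough illustration of combining cross-validation with survey design information, the Python sketch below selects λ by a weighted K-fold cross-validation. Folds are formed from whole primary sampling units, and the sampling weights enter both the model fit and the validation error. This is not the authors' method: the fold construction, the squared-error loss, the coordinate-descent solver, and all names (`weighted_lasso`, `design_based_cv`, `psu`) are assumptions made for illustration only.

```python
import random


def soft_threshold(z, g):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def weighted_lasso(X, y, w, lam, n_iter=200):
    """Minimise sum_i w_i (y_i - x_i.b)^2 / (2 * sum(w)) + lam * ||b||_1.

    Plain coordinate descent without an intercept (illustrative assumption).
    """
    n, p = len(X), len(X[0])
    sw = sum(w)
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            num = 0.0
            den = 0.0
            for i in range(n):
                # Prediction from all features except j (partial residual).
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                num += w[i] * X[i][j] * (y[i] - others)
                den += w[i] * X[i][j] ** 2
            beta[j] = soft_threshold(num / sw, lam) / (den / sw) if den > 0 else 0.0
    return beta


def design_based_cv(X, y, w, psu, lambdas, k=3, seed=0):
    """Pick lambda by weighted K-fold CV with folds made of whole PSUs.

    Hypothetical scheme: every unit of a PSU goes to the same fold, and
    the validation error is a sampling-weighted mean squared error.
    """
    clusters = sorted(set(psu))
    random.Random(seed).shuffle(clusters)
    fold_of = {c: idx % k for idx, c in enumerate(clusters)}
    errors = []
    for lam in lambdas:
        num = 0.0
        den = 0.0
        for f in range(k):
            train = [i for i in range(len(y)) if fold_of[psu[i]] != f]
            test = [i for i in range(len(y)) if fold_of[psu[i]] == f]
            beta = weighted_lasso(
                [X[i] for i in train],
                [y[i] for i in train],
                [w[i] for i in train],
                lam,
            )
            for i in test:
                pred = sum(a * b for a, b in zip(X[i], beta))
                num += w[i] * (y[i] - pred) ** 2
                den += w[i]
        errors.append(num / den)
    best = min(range(len(lambdas)), key=errors.__getitem__)
    return lambdas[best], errors
```

Calling `design_based_cv(X, y, w, psu, lambdas)` with lists of covariate rows, responses, sampling weights, and PSU labels returns the λ with the smallest weighted validation error together with the error for each candidate. Keeping PSUs intact within folds is one simple way to respect clustering; a replicate-weights approach would instead adjust the weights of the training units in each replicate.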