We consider the estimation of regression models on strata defined using a
categorical covariate, in order to identify interactions between this
categorical covariate and the other predictors. A basic approach requires the
choice of a reference stratum. We show that the performance of a penalized
version of this approach depends on this arbitrary choice. We propose a refined
approach that bypasses this arbitrary choice, at almost no additional
computational cost. Regarding model selection consistency, our proposal mimics
the strategy based on an optimal and covariate-specific choice for the
reference stratum. Results from an empirical study confirm that our proposal
generally outperforms the basic approach in the identification and description
of the interactions. An illustration on gene expression data is provided.

Comment: 23 pages, 5 figures