Lasso model selection in multi-dimensional contingency tables?

Abstract

We develop a smooth Lasso for sparse, high-dimensional contingency tables and compare its performance with the usual Lasso and with the now-classical backward elimination algorithm. In simulation, the usual Lasso had great difficulty identifying the correct model: irrespective of the sample size, it never succeeded in doing so in our study. By comparison, the smooth Lasso performed better, improving as the sample size increased. The backward elimination algorithm also performed well and was better than the smooth Lasso at small sample sizes. A further difficulty is that Lasso methods do not respect the marginal constraints of hierarchy and so lead to non-hierarchical models, which are unscientific. Furthermore, even when one can demonstrate classically that some effects in the model are inestimable, the Lasso methods still provide penalized estimates of them. These problems call Lasso methods into question.
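To make the hierarchy issue concrete, the following is a minimal sketch (not the authors' code) of an L1-penalized log-linear (Poisson) fit to a small contingency table using statsmodels; the 2x2x2 table, the "true" sparse model, and the penalty weights are illustrative assumptions. With an L1 penalty, an interaction term such as AB can remain in the fitted model while one of its main effects is shrunk exactly to zero, i.e. the selected model need not be hierarchical.

```python
# Illustrative sketch only: Lasso-type selection for a log-linear model
# fitted to a simulated 2x2x2 contingency table. Table size, true
# coefficients, and penalty weights are assumptions for demonstration.
import itertools
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Cells of a 2x2x2 table, coded -1/+1 so that columns of the design
# matrix are the usual log-linear main-effect and interaction contrasts.
levels = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)
A, B, C = levels.T

# Saturated design: intercept, main effects, and all interactions.
X = np.column_stack([np.ones(8), A, B, C, A * B, A * C, B * C, A * B * C])
names = ["1", "A", "B", "C", "AB", "AC", "BC", "ABC"]

# Sparse "true" hierarchical model: main effects plus one two-way interaction.
beta_true = np.array([3.0, 0.4, -0.3, 0.2, 0.5, 0.0, 0.0, 0.0])
counts = rng.poisson(np.exp(X @ beta_true))

# Pure Lasso (L1_wt=1.0) over a few penalty weights; the intercept is
# left unpenalized via a vector of per-coefficient alphas. The printed
# sets of nonzero terms need not respect hierarchy (e.g. AB may survive
# while A or B is dropped).
for a in (0.001, 0.01, 0.1):
    alpha = a * np.r_[0.0, np.ones(7)]
    fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit_regularized(
        method="elastic_net", alpha=alpha, L1_wt=1.0)
    kept = [n for n, b in zip(names, fit.params) if abs(b) > 1e-8]
    print(f"alpha={a}: nonzero terms {kept}")
```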
