Cost-efficient Variable Selection Using Branching LARS

Abstract

Variable selection is a difficult problem in statistical model building. Identifying cost-efficient diagnostic factors is very important to health researchers, but most variable selection methods do not take into account the cost of collecting data for the predictors. Our focus is the trade-off between statistical significance and the cost of collecting data for the statistical model. A Branching LARS (BLARS) procedure has been developed that selects and estimates the important predictors to build a model that is not only good at prediction but also cost-efficient. The BLARS method extends the LARS (least angle regression) variable selection method to incorporate the various costs of factors, and it employs a branch-and-bound search to accelerate the search process. Both additive and non-additive costs will be addressed. The R package branchLars, which implements BLARS, will be described. We will show that a cheaper model can be selected by sacrificing a user-selected amount of model accuracy.
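The cost-versus-accuracy trade-off can be illustrated with a minimal Python sketch. This is not the BLARS algorithm and not the branchLars API: it is a generic branch-and-bound over predictor subsets, assuming additive, non-negative costs and an error measure that never increases when a predictor is added (as with a least-squares residual sum of squares). The function name, the predictor names, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


def cheapest_within_tolerance(
    costs: Dict[str, float],
    error_of: Callable[[FrozenSet[str]], float],
    tolerance: float,
) -> Tuple[FrozenSet[str], float]:
    """Cheapest subset whose error is at most (1 + tolerance) * full-model error.

    Assumes additive, non-negative costs and an error that is
    non-increasing as predictors are added.
    """
    # Decide on expensive predictors first so cheap solutions are found early.
    names: List[str] = sorted(costs, key=lambda n: costs[n], reverse=True)
    full = frozenset(names)
    limit = error_of(full) * (1.0 + tolerance)
    best_set, best_cost = full, sum(costs.values())

    def search(i: int, included: FrozenSet[str], cost: float) -> None:
        nonlocal best_set, best_cost
        # Bound 1: cost can only grow, so this branch cannot beat the incumbent.
        if cost >= best_cost:
            return
        # Bound 2: even with every undecided predictor added, the error is too high.
        if error_of(included | frozenset(names[i:])) > limit:
            return
        # Feasible already: adding more predictors would only raise the cost.
        if error_of(included) <= limit:
            best_set, best_cost = included, cost
            return
        # Branch: first exclude names[i], then include it.
        search(i + 1, included, cost)
        search(i + 1, included | {names[i]}, cost + costs[names[i]])

    search(0, frozenset(), 0.0)
    return best_set, best_cost


# Toy data (made up): collection cost and error reduction per predictor.
toy_costs = {"age": 1.0, "questionnaire": 5.0, "blood_test": 20.0, "mri": 500.0}
toy_gain = {"age": 10.0, "questionnaire": 8.0, "blood_test": 30.0, "mri": 4.0}


def toy_error(subset: FrozenSet[str]) -> float:
    """Hypothetical error: a baseline of 100 minus the gains of included predictors."""
    return 100.0 - sum(toy_gain[name] for name in subset)


chosen, total_cost = cheapest_within_tolerance(toy_costs, toy_error, tolerance=0.10)
print(sorted(chosen), total_cost)
```

With these toy numbers, allowing a 10% increase in error over the full model lets the search drop the most expensive predictor, while a tolerance of zero forces it to keep all four.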
