Learning with many experts: model selection and sparsity
Experts classifying data are often imprecise. Recently, several models have
been proposed to train classifiers using the noisy labels generated by these
experts. How should one choose among these models? Because the true labels are
unavailable, one cannot perform model selection with the standard versions of
methods such as empirical risk minimization and cross-validation. To enable
model selection, we present a surrogate loss and provide theoretical guarantees
of its consistency. Next, we discuss
how this loss can be used to tune a penalty that induces sparsity in the
parameters of a traditional class of models. Sparsity yields more parsimonious
models and can prevent overfitting. Nevertheless, it has seldom been
discussed in the context of noisy labels due to the difficulty in model
selection and, therefore, in choosing tuning parameters. We apply these
techniques to several sets of simulated and real data.

Comment: This is the pre-peer reviewed version.
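The workflow the abstract describes — training on noisy expert labels and tuning a sparsity penalty without access to true labels — can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the surrogate used here (held-out error against majority-vote expert labels) and the L1-penalized logistic model are stand-ins, not the paper's actual loss or model class.

```python
# Hypothetical sketch: tuning an L1 penalty when only noisy expert labels
# are available. The "surrogate loss" here (held-out disagreement with
# majority-vote labels) is an illustrative assumption, not the paper's loss.
import numpy as np

rng = np.random.default_rng(0)

# Simulate data with unknown true labels and three imprecise experts.
n, d = 400, 10
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]                      # sparse ground truth
y_true = (X @ w_true > 0).astype(float)
flip = [0.15, 0.25, 0.30]                           # per-expert error rates
experts = np.stack([np.where(rng.random(n) < p, 1 - y_true, y_true)
                    for p in flip])
y_vote = (experts.mean(axis=0) > 0.5).astype(float)  # majority-vote labels

def fit_l1_logreg(X, y, lam, steps=2000, lr=0.05):
    """L1-penalized logistic regression via proximal gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * (X.T @ (p - y) / len(y))          # gradient step
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft-threshold
    return w

# Model selection: pick lambda by the surrogate loss on held-out noisy labels.
tr, te = np.arange(n) < 300, np.arange(n) >= 300
best_err, best_lam = min(
    (float(np.mean((X[te] @ fit_l1_logreg(X[tr], y_vote[tr], lam) > 0)
                   != y_vote[te])), lam)
    for lam in [0.0, 0.01, 0.05, 0.1, 0.3]
)
print("chosen lambda:", best_lam, "held-out surrogate error:", round(best_err, 3))
```

Larger penalties zero out more coefficients, and the surrogate loss provides the otherwise-missing criterion for deciding how much sparsity is appropriate when no clean validation labels exist.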