Multi-task feature selection methods often assume that the learning tasks share their relevant and irrelevant features. However, this assumption may be too restrictive in practice. For example,
there may be a few tasks with specific relevant
and irrelevant features (outlier tasks). Similarly,
a few of the features may be relevant for
only some of the tasks (outlier features). To account
for this, we propose a model for multi-task
feature selection based on a robust prior distribution
that introduces a set of binary latent variables
to identify outlier tasks and outlier features.
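To make the idea concrete, a prior of this kind could combine a spike-and-slab factor per coefficient with binary indicators that couple the tasks; the following sketch is illustrative only (the symbols $w_{jk}$, $z_{jk}$ and $\sigma^2$, and the exact functional form, are our assumptions and are not fixed by this abstract):
\[
p(w_{jk} \mid z_{jk}) \;=\; z_{jk}\,\mathcal{N}(w_{jk} \mid 0, \sigma^2) \;+\; (1 - z_{jk})\,\delta(w_{jk}),
\]
where $w_{jk}$ is the coefficient of feature $j$ in task $k$, $z_{jk} \in \{0,1\}$ indicates whether that feature is used by that task, and additional binary latent variables determine whether $z_{jk}$ follows the relevance pattern shared across tasks or instead a task-specific (outlier task) or feature-specific (outlier feature) pattern.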
Expectation propagation can be used for efficient
approximate inference under the proposed prior.
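For reference, the generic expectation propagation scheme (standard EP, not the model-specific update equations, which are not given here) replaces each intractable posterior factor $f_i$ with a tractable approximate factor $\tilde f_i$ refined by moment matching:
\[
q^{\setminus i}(\boldsymbol{\theta}) \propto \frac{q(\boldsymbol{\theta})}{\tilde f_i(\boldsymbol{\theta})}, \qquad
q^{\mathrm{new}}(\boldsymbol{\theta}) = \operatorname{proj}\!\left[ f_i(\boldsymbol{\theta})\, q^{\setminus i}(\boldsymbol{\theta}) \right], \qquad
\tilde f_i^{\mathrm{new}}(\boldsymbol{\theta}) \propto \frac{q^{\mathrm{new}}(\boldsymbol{\theta})}{q^{\setminus i}(\boldsymbol{\theta})},
\]
where $\operatorname{proj}[\cdot]$ denotes Kullback-Leibler projection onto the chosen exponential family (e.g., matching means and variances for Gaussian factors).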
Several experiments show that a model based on
the new robust prior provides better predictive
performance than other benchmark methods.

Daniel Hernández-Lobato gratefully acknowledges the use of the facilities of Centro de Computación Científica (CCC) at Universidad Autónoma de Madrid. This author also acknowledges financial support from the Spanish Plan Nacional I+D+i, Grant TIN2013-42351-P, and from Comunidad de Madrid, Grant S2013/ICE-2845 CASI-CAM-CM. José Miguel Hernández-Lobato acknowledges financial support from the Rafael del Pino Foundation.