134,789 research outputs found
Universum Prescription: Regularization using Unlabeled Data
This paper shows that simply prescribing "none of the above" labels to
unlabeled data has a beneficial regularization effect to supervised learning.
We call it universum prescription by the fact that the prescribed labels cannot
be one of the supervised labels. In spite of its simplicity, universum
prescription obtained competitive results in training deep convolutional
networks for CIFAR-10, CIFAR-100, STL-10 and ImageNet datasets. A qualitative
justification of these approaches using Rademacher complexity is presented. The
effect of a regularization parameter -- probability of sampling from unlabeled
data -- is also studied empirically.Comment: 7 pages for article, 3 pages for supplemental material. To appear in
AAAI-1
Learning From Labeled And Unlabeled Data: An Empirical Study Across Techniques And Domains
There has been increased interest in devising learning techniques that
combine unlabeled data with labeled data ? i.e. semi-supervised learning.
However, to the best of our knowledge, no study has been performed across
various techniques and different types and amounts of labeled and unlabeled
data. Moreover, most of the published work on semi-supervised learning
techniques assumes that the labeled and unlabeled data come from the same
distribution. It is possible for the labeling process to be associated with a
selection bias such that the distributions of data points in the labeled and
unlabeled sets are different. Not correcting for such bias can result in biased
function approximation with potentially poor performance. In this paper, we
present an empirical study of various semi-supervised learning techniques on a
variety of datasets. We attempt to answer various questions such as the effect
of independence or relevance amongst features, the effect of the size of the
labeled and unlabeled sets and the effect of noise. We also investigate the
impact of sample-selection bias on the semi-supervised learning techniques
under study and implement a bivariate probit technique particularly designed to
correct for such bias
- …
