3,998 research outputs found
Semi-supervised Learning based on Distributionally Robust Optimization
We propose a novel method for semi-supervised learning (SSL) based on
data-driven distributionally robust optimization (DRO) using optimal transport
metrics. Our proposed method improves generalization by using the
unlabeled data to restrict the support of the worst-case distribution in our
DRO formulation. We make the formulation practical by proposing a
stochastic gradient descent algorithm that makes the training procedure easy
to implement. We demonstrate that our semi-supervised DRO
method achieves lower generalization error than natural supervised
procedures and state-of-the-art SSL estimators. Finally, we include a
discussion on the large sample behavior of the optimal uncertainty region in
the DRO formulation. Our discussion exposes important aspects such as the role
of dimension reduction in SSL.
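
The idea of restricting the support of the worst-case distribution can be illustrated with a small sketch. This is not the paper's algorithm (the abstract describes a stochastic gradient descent procedure whose details are not given here); it only evaluates, for a fixed parameter vector, the standard dual form of a worst-case expected loss over an optimal-transport ball when the adversary may move mass only onto a finite candidate set built from the labeled and unlabeled points. The logistic loss, the transport cost (squared Euclidean distance plus a hypothetical label-flip penalty `kappa`), the grid search over the multiplier, and all function names are assumptions made for illustration.

```python
import numpy as np

def logistic_loss(theta, X, y):
    # Per-sample logistic loss for labels in {-1, +1}.
    return np.logaddexp(0.0, -y * (X @ theta))

def candidate_support(X_lab, y_lab, X_unl):
    # Labeled points keep their labels; each unlabeled point is
    # included twice, once with each possible label.
    X = np.vstack([X_lab, X_unl, X_unl])
    y = np.concatenate([y_lab, np.ones(len(X_unl)), -np.ones(len(X_unl))])
    return X, y

def transport_cost(X_lab, y_lab, X_c, y_c, kappa=1.0):
    # Assumed cost: squared Euclidean distance in features plus a
    # penalty kappa whenever the label changes.
    sq = ((X_lab[:, None, :] - X_c[None, :, :]) ** 2).sum(axis=2)
    flip = kappa * (y_lab[:, None] != y_c[None, :])
    return sq + flip

def robust_loss(theta, X_lab, y_lab, X_unl, delta, lambdas):
    # Dual of the worst-case expected loss over a transport ball of
    # radius delta with finite support:
    #   min over lam >= 0 of  lam*delta + mean_i max_j [loss_j - lam*cost_ij]
    # The minimum is approximated on the supplied grid of multipliers.
    X_c, y_c = candidate_support(X_lab, y_lab, X_unl)
    cand = logistic_loss(theta, X_c, y_c)
    cost = transport_cost(X_lab, y_lab, X_c, y_c)
    values = [lam * delta + np.max(cand[None, :] - lam * cost, axis=1).mean()
              for lam in lambdas]
    return min(values)

# Toy usage with synthetic data (all values are illustrative).
rng = np.random.default_rng(0)
X_lab = rng.normal(size=(10, 2))
y_lab = np.where(X_lab[:, 0] > 0, 1.0, -1.0)
X_unl = rng.normal(size=(30, 2))
theta = np.array([1.0, 0.0])
lambdas = np.linspace(0.0, 50.0, 201)
print(robust_loss(theta, X_lab, y_lab, X_unl, delta=0.5, lambdas=lambdas))
```

Because every labeled point is itself a candidate at zero cost, the value returned is never below the plain empirical loss, and it grows with `delta`. In a training loop one would minimize this quantity over `theta`, for example with a gradient-based optimizer.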
Generative Adversarial Positive-Unlabelled Learning
In this work, we consider the task of classifying binary positive-unlabeled
(PU) data. Existing discriminative PU models seek
an optimal reweighting strategy for the U data so that a good decision boundary
can be found. However, given limited P data, conventional PU models tend to
suffer from overfitting when adapted to very flexible deep neural networks. In
contrast, we are the first to approach the
binary PU task from the perspective of generative learning, leveraging
generative adversarial networks (GANs). Our generative
positive-unlabeled (GenPU) framework incorporates an array of discriminators
and generators that are endowed with different roles in simultaneously
producing positive and negative realistic samples. We provide theoretical
analysis to justify that, at equilibrium, GenPU is capable of recovering both
positive and negative data distributions. Moreover, we show that GenPU
generalizes and is closely related to semi-supervised classification. Given
rather limited P data, experiments on both synthetic and real-world datasets
demonstrate the effectiveness of our proposed framework. With unlimited
streams of realistic and diverse samples generated by GenPU, a very flexible
classifier can then be trained using deep neural networks.
Comment: 8 page