3,998 research outputs found

    Semi-supervised Learning based on Distributionally Robust Optimization

    Full text link
    We propose a novel method for semi-supervised learning (SSL) based on data-driven distributionally robust optimization (DRO) using optimal transport metrics. Our proposed method enhances generalization error by using the unlabeled data to restrict the support of the worst case distribution in our DRO formulation. We enable the implementation of our DRO formulation by proposing a stochastic gradient descent algorithm which allows to easily implement the training procedure. We demonstrate that our Semi-supervised DRO method is able to improve the generalization error over natural supervised procedures and state-of-the-art SSL estimators. Finally, we include a discussion on the large sample behavior of the optimal uncertainty region in the DRO formulation. Our discussion exposes important aspects such as the role of dimension reduction in SSL

    Generative Adversarial Positive-Unlabelled Learning

    Full text link
    In this work, we consider the task of classifying binary positive-unlabeled (PU) data. The existing discriminative learning based PU models attempt to seek an optimal reweighting strategy for U data, so that a decent decision boundary can be found. However, given limited P data, the conventional PU models tend to suffer from overfitting when adapted to very flexible deep neural networks. In contrast, we are the first to innovate a totally new paradigm to attack the binary PU task, from perspective of generative learning by leveraging the powerful generative adversarial networks (GAN). Our generative positive-unlabeled (GenPU) framework incorporates an array of discriminators and generators that are endowed with different roles in simultaneously producing positive and negative realistic samples. We provide theoretical analysis to justify that, at equilibrium, GenPU is capable of recovering both positive and negative data distributions. Moreover, we show GenPU is generalizable and closely related to the semi-supervised classification. Given rather limited P data, experiments on both synthetic and real-world dataset demonstrate the effectiveness of our proposed framework. With infinite realistic and diverse sample streams generated from GenPU, a very flexible classifier can then be trained using deep neural networks.Comment: 8 page
    corecore