
    Accuracy of Latent-Variable Estimation in Bayesian Semi-Supervised Learning

    Hierarchical probabilistic models, such as Gaussian mixture models, are widely used for unsupervised learning tasks. These models consist of observable and latent variables, which represent the observable data and the underlying data-generation process, respectively. Unsupervised learning tasks, such as cluster analysis, are then regarded as estimation of the latent variables from the observable ones. In semi-supervised learning, where some labels are observed, the estimation of the latent variables is more precise than in the unsupervised case, and one concern is to quantify the effect of the labeled data. However, the accuracy of latent-variable estimation has so far lacked sufficient theoretical analysis. In a previous study, a distribution-based error function was formulated and its asymptotic form was calculated for unsupervised learning with generative models; it was shown that, for the estimation of latent variables, the Bayes method is more accurate than the maximum-likelihood method. The present paper derives the asymptotic forms of this error function in Bayesian semi-supervised learning for both discriminative and generative models. The results show that the generative model, which uses all of the given data, performs better when the model is well specified. Comment: 25 pages, 4 figures
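    To make the setting concrete, here is a minimal sketch of latent-variable estimation in a semi-supervised Gaussian mixture: a plain EM routine in which observed labels pin down the responsibilities of their data points. The function name `semi_supervised_em` and all details are illustrative assumptions; this is a maximum-likelihood sketch, not the Bayes posterior the paper analyzes.

```python
import numpy as np

def semi_supervised_em(x, labels, n_iter=50):
    """EM for a 1-D two-component Gaussian mixture where some points carry
    observed labels (0/1) and the rest are None (unlabeled).
    Hypothetical illustration, not the paper's exact estimator."""
    mu = np.array([x.min(), x.max()])          # crude initialization
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibilities for the latent labels
        dens = np.stack([pi[k] / (sigma[k] * np.sqrt(2 * np.pi))
                         * np.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                         for k in range(2)], axis=1)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # Observed labels override the posterior: their latent variables
        # are known, which is exactly what sharpens the estimate.
        for i, y in enumerate(labels):
            if y is not None:
                resp[i] = np.eye(2)[y]
        # M-step: update the mixture parameters
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sigma, resp
```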

    A random matrix analysis and improvement of semi-supervised learning for large dimensional data

    This article provides an original understanding of the behavior of a class of graph-oriented semi-supervised learning algorithms in the limit where the data are both numerous and high-dimensional. It is demonstrated that the intuition at the root of these methods collapses in this limit and that, as a result, most of them become inconsistent. Corrective measures and a new data-driven parametrization scheme are proposed, along with a theoretical analysis of the asymptotic performance of the resulting approach. A surprisingly close match between theoretical performance on Gaussian mixture models and performance on real datasets is also illustrated throughout the article, suggesting the relevance of the proposed analysis for practical data. As a result, significant performance gains are observed on practical data classification using the proposed parametrization.
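    For reference, the class of algorithms analyzed here builds on graph-based label propagation. The sketch below shows a standard, uncorrected variant in the style of Zhou et al.; the paper's data-driven corrective parametrization is not reproduced, and the function name `label_propagation` is my own.

```python
import numpy as np

def label_propagation(W, y, alpha=0.9, n_iter=200):
    """Plain graph-based label propagation.
    W: (n, n) symmetric affinity matrix.
    y: (n, K) one-hot label scores, zero rows for unlabeled points.
    This is the baseline family the paper shows to be inconsistent
    in high dimensions, not its corrected scheme."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # normalized affinity D^{-1/2} W D^{-1/2}
    F = y.copy().astype(float)
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1 - alpha) * y  # diffuse scores, re-inject labels
    return F.argmax(axis=1)
```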

    Interpolation Consistency Training for Semi-Supervised Learning

    We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm. ICT encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points. In classification problems, ICT moves the decision boundary to low-density regions of the data distribution. Our experiments show that ICT achieves state-of-the-art performance when applied to standard neural network architectures on the CIFAR-10 and SVHN benchmark datasets. Our theoretical analysis shows that ICT corresponds to a certain type of data-adaptive regularization with unlabeled points, which reduces overfitting to labeled points under high confidence values. Comment: Extended version of the IJCAI 2019 paper. Semi-supervised Learning, Deep Learning, Neural Networks. All previous results are unchanged; new theoretical and empirical results added.
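    The ICT objective is simple to state in code. The following sketch, assuming a mean-teacher (EMA) network for the targets, penalizes the gap between the student's prediction at a mixup of two unlabeled batches and the same mixup of the teacher's predictions; the hyperparameters and the helper name `ict_loss` are illustrative.

```python
import torch
import torch.nn.functional as F

def ict_loss(model, ema_model, u1, u2, alpha=1.0):
    """Interpolation consistency term: the prediction at a mixup of two
    unlabeled batches (u1, u2) should match the same mixup of the
    teacher's predictions at those batches."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    with torch.no_grad():
        # teacher (EMA) predictions at the endpoints
        p1 = torch.softmax(ema_model(u1), dim=1)
        p2 = torch.softmax(ema_model(u2), dim=1)
    mixed_x = lam * u1 + (1 - lam) * u2    # interpolated inputs
    mixed_p = lam * p1 + (1 - lam) * p2    # interpolated targets
    return F.mse_loss(torch.softmax(model(mixed_x), dim=1), mixed_p)
```

    In training, this term would be added to the usual supervised cross-entropy loss on the labeled batch, typically with a ramp-up weight.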

    Consistent Semi-Supervised Graph Regularization for High Dimensional Data

    Semi-supervised Laplacian regularization, a standard graph-based approach for learning from both labelled and unlabelled data, was recently shown to extract negligible information from unlabelled data in high dimensions (Mai and Couillet 2018), so that it is outperformed by its unsupervised counterpart, spectral clustering, given sufficiently many unlabelled data. Following a detailed discussion of the origin of this inconsistency problem, a novel regularization approach involving a centering operation is proposed as a solution, supported by both theoretical analysis and empirical results.
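    A rough sketch of the centering idea, on my own simplified reading of the abstract: project the affinity matrix onto the complement of the all-ones direction before propagating labels, removing the non-informative mean component that drives the inconsistency. The exact normalization and solver used in the paper are not reproduced here.

```python
import numpy as np

def centered_propagation_step(W, y):
    """One label-propagation step with a centered affinity matrix.
    W: (n, n) affinity matrix; y: (n, K) label scores.
    Simplified illustration of the centering operation, not the
    paper's exact regularization."""
    n = W.shape[0]
    P = np.eye(n) - np.ones((n, n)) / n   # projector orthogonal to the all-ones vector
    Wc = P @ W @ P                        # centered affinity
    return Wc @ y                         # propagate scores through the centered graph
```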

    Semi-supervised Community Detection via Structural Similarity Metrics

    Motivated by social network analysis and network-based recommendation systems, we study a semi-supervised community detection problem in which the objective is to estimate the community label of a new node using the network topology and partially observed community labels of existing nodes. The network is modeled using a degree-corrected stochastic block model, which allows for severe degree heterogeneity and potentially non-assortative communities. We propose an algorithm that computes a `structural similarity metric' between the new node and each of the K communities by aggregating labeled and unlabeled data. The estimated label of the new node is the value of k that maximizes this similarity metric. Our method is fast and numerically outperforms existing semi-supervised algorithms. Theoretically, we derive explicit bounds for the misclassification error and show the efficiency of our method by comparing it with an ideal classifier. To the best of our knowledge, this is the first semi-supervised community detection algorithm that offers theoretical guarantees. Comment: 9 pages, 8 figures; accepted at the 11th International Conference on Learning Representations (ICLR 2023).
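    A hedged sketch of the decision rule described above: score the new node against each of the K communities by its normalized edge mass into that community and take the argmax. The paper's actual structural similarity metric also aggregates unlabeled nodes and corrects for degree heterogeneity under the DCBM; the helper `classify_new_node` is purely illustrative.

```python
import numpy as np

def classify_new_node(a_new, memberships):
    """a_new: (n,) 0/1 adjacency vector of the new node to existing nodes.
    memberships: (n, K) one-hot community labels (zero rows if unknown).
    Illustrative stand-in for the paper's structural similarity metric."""
    edge_mass = a_new @ memberships            # edges landing in each community
    comm_size = memberships.sum(axis=0)        # labeled size of each community
    scores = edge_mass / np.maximum(comm_size, 1)  # size-normalized affinity
    return int(np.argmax(scores))              # estimated community label k
```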