Accuracy of Latent-Variable Estimation in Bayesian Semi-Supervised Learning
Hierarchical probabilistic models, such as Gaussian mixture models, are
widely used for unsupervised learning tasks. These models consist of observable
and latent variables, which represent the observable data and the underlying
data-generation process, respectively. Unsupervised learning tasks, such as
cluster analysis, are regarded as estimations of latent variables based on the
observable ones. The estimation of latent variables in semi-supervised
learning, where some labels are observed, will be more precise than in
unsupervised learning, and one concern is to clarify the effect of the labeled
data. However, there has not been sufficient theoretical analysis of the
accuracy of the estimation of latent variables. In a previous study, a
distribution-based error function was formulated, and its asymptotic form was
calculated for unsupervised learning with generative models. It has been shown
that, for the estimation of latent variables, the Bayes method is more accurate
than the maximum-likelihood method. The present paper reveals the asymptotic
forms of the error function in Bayesian semi-supervised learning for both
discriminative and generative models. The results show that the generative
model, which uses all of the given data, performs better when the model is well
specified.
Comment: 25 pages, 4 figures
A random matrix analysis and improvement of semi-supervised learning for large dimensional data
This article provides an original understanding of the behavior of a class of
graph-oriented semi-supervised learning algorithms in the limit of large and
numerous data. It is demonstrated that the intuition at the root of these
methods collapses in this limit and that, as a result, most of them become
inconsistent. Corrective measures and a new data-driven parametrization scheme
are proposed along with a theoretical analysis of the asymptotic performances
of the resulting approach. A surprisingly close agreement between theoretical
performance on Gaussian mixture models and performance on real datasets is also
illustrated throughout the article, thereby suggesting the importance of the
proposed analysis for dealing with practical data. As a result, significant
performance gains are observed on practical data classification using the
proposed parametrization.
Interpolation Consistency Training for Semi-Supervised Learning
We introduce Interpolation Consistency Training (ICT), a simple and
computationally efficient algorithm for training Deep Neural Networks in the
semi-supervised learning paradigm. ICT encourages the prediction at an
interpolation of unlabeled points to be consistent with the interpolation of
the predictions at those points. In classification problems, ICT moves the
decision boundary to low-density regions of the data distribution. Our
experiments show that ICT achieves state-of-the-art performance when applied to
standard neural network architectures on the CIFAR-10 and SVHN benchmark
datasets. Our theoretical analysis shows that ICT corresponds to a certain type
of data-adaptive regularization with unlabeled points which reduces overfitting
to labeled points under high confidence values.
Comment: Extended version of IJCAI 2019 paper. Semi-supervised Learning, Deep
Learning, Neural Networks. All the previous results are unchanged; we added
new theoretical and empirical results
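The consistency idea described in this abstract can be illustrated with a minimal sketch. It only computes the gap between the prediction at an interpolated point and the interpolation of the two predictions. The function names, the fixed mixing coefficient `lam`, and the two toy models are assumptions made for illustration; the abstract does not say how the coefficient is chosen or how target predictions are produced, so this is not the authors' training procedure.

```python
import random

def mix(a, b, lam):
    """Elementwise interpolation lam * a + (1 - lam) * b of two vectors."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]

def ict_consistency_loss(predict, u1, u2, lam):
    """Mean squared gap between predict(mix(a, b)) and mix(predict(a), predict(b)).

    predict: hypothetical model mapping an input vector to an output vector.
    u1, u2:  two equally sized batches of unlabeled points.
    lam:     mixing coefficient in [0, 1] (fixed here; an assumption).
    """
    total, count = 0.0, 0
    for a, b in zip(u1, u2):
        pred_at_mix = predict(mix(a, b, lam))
        mix_of_preds = mix(predict(a), predict(b), lam)
        for p, q in zip(pred_at_mix, mix_of_preds):
            total += (p - q) ** 2
            count += 1
    return total / count

def linear_model(x):
    # A linear map commutes with interpolation, so its loss is (numerically) zero.
    return [2.0 * x[0] - x[1], 0.5 * x[1]]

def nonlinear_model(x):
    # A nonlinear map generally does not, so it is penalized.
    return [x[0] ** 2, x[1]]

rng = random.Random(0)
u1 = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(8)]
u2 = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(8)]
lam = 0.3

linear_loss = ict_consistency_loss(linear_model, u1, u2, lam)
nonlinear_loss = ict_consistency_loss(nonlinear_model, u1, u2, lam)
print(linear_loss, nonlinear_loss)
```

In a semi-supervised setup, a term of this kind would be added to the supervised loss on the labeled points; how the two are weighted is left open here.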
Consistent Semi-Supervised Graph Regularization for High Dimensional Data
Semi-supervised Laplacian regularization, a standard graph-based approach for
learning from both labelled and unlabelled data, was recently demonstrated to
have an insignificant high dimensional learning efficiency with respect to
unlabelled data (Mai and Couillet 2018), causing it to be outperformed by its
unsupervised counterpart, spectral clustering, given sufficient unlabelled
data. Following a detailed discussion of the origin of this inconsistency
problem, a novel regularization approach involving a centering operation is
proposed as a solution, supported by both theoretical analysis and empirical
results.
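For orientation, the baseline that this abstract criticizes, standard semi-supervised Laplacian regularization, can be sketched as follows. Labelled nodes keep their scores, and each unlabelled node is repeatedly set to the weighted average of its neighbours, which converges to the minimizer of the weighted sum of squared score differences over edges. The function name, the dictionary-based label format, and the toy path graph are assumptions; the centering-based correction proposed in the paper is not shown, since the abstract gives no details of it.

```python
def laplacian_regularization(weights, labels, n_iter=200):
    """Harmonic-function solution of graph Laplacian regularization.

    weights: symmetric n x n list of lists of non-negative edge weights.
    labels:  dict {node index: score} for labelled nodes (e.g. +1.0 / -1.0).
    Returns a list of n scores; labelled scores are left unchanged.
    """
    n = len(weights)
    scores = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(n_iter):
        for i in range(n):
            if i in labels:
                continue  # labelled nodes stay clamped to their given score
            degree = sum(weights[i][j] for j in range(n) if j != i)
            if degree > 0:
                # Weighted average of the neighbours' current scores.
                scores[i] = sum(
                    weights[i][j] * scores[j] for j in range(n) if j != i
                ) / degree
    return scores

# Toy path graph 0 - 1 - 2 - 3 with unit weights; the end nodes are labelled.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
labels = {0: 1.0, 3: -1.0}
scores = laplacian_regularization(weights, labels)
print(scores)
```

An unlabelled node would then be classified by the sign of its score; on this toy graph the scores interpolate linearly between the two labelled ends.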
Semi-supervised Community Detection via Structural Similarity Metrics
Motivated by social network analysis and network-based recommendation
systems, we study a semi-supervised community detection problem in which the
objective is to estimate the community label of a new node using the network
topology and partially observed community labels of existing nodes. The network
is modeled using a degree-corrected stochastic block model, which allows for
severe degree heterogeneity and potentially non-assortative communities. We
propose an algorithm that computes a `structural similarity metric' between the
new node and each of the communities by aggregating labeled and unlabeled
data. The estimated label of the new node corresponds to the community that
maximizes this similarity metric. Our method is fast and numerically
outperforms existing semi-supervised algorithms. Theoretically, we derive
explicit bounds for the misclassification error and show the efficiency of our
method by comparing it with an ideal classifier. Our findings highlight, to the
best of our knowledge, the first semi-supervised community detection algorithm
that offers theoretical guarantees.
Comment: 9 pages, 8 figures, accepted by the 11th International Conference on
Learning Representations (ICLR 2023)