Accuracy of Latent-Variable Estimation in Bayesian Semi-Supervised Learning
Hierarchical probabilistic models, such as Gaussian mixture models, are
widely used for unsupervised learning tasks. These models consist of observable
and latent variables, which represent the observable data and the underlying
data-generation process, respectively. Unsupervised learning tasks, such as
cluster analysis, are regarded as estimations of latent variables based on the
observable ones. The estimation of latent variables in semi-supervised
learning, where some labels are observed, is expected to be more precise than
in unsupervised learning, and one concern is to clarify the effect of the
labeled data. However, there has not been sufficient theoretical analysis of the
accuracy of the estimation of latent variables. In a previous study, a
distribution-based error function was formulated, and its asymptotic form was
calculated for unsupervised learning with generative models. It has been shown
that, for the estimation of latent variables, the Bayes method is more accurate
than the maximum-likelihood method. The present paper reveals the asymptotic
forms of the error function in Bayesian semi-supervised learning for both
discriminative and generative models. The results show that the generative
model, which uses all of the given data, performs better when the model is well
specified.
Comment: 25 pages, 4 figures
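To make the idea of a distribution-based error for latent variables concrete, the following is a minimal sketch and not the paper's method. It assumes a two-component, unit-variance Gaussian mixture in one dimension and compares the true posterior over a binary latent label with the posterior implied by an estimated parameter set, averaging a per-point Kullback-Leibler divergence. The function names, the parameter tuple `(mean0, mean1, weight1)`, and the plug-in (rather than Bayes) estimate are all assumptions introduced for illustration.

```python
import math
import random


def posterior_label1(x, mean0, mean1, weight1):
    """P(y = 1 | x) for a two-component, unit-variance Gaussian mixture."""
    # Log-odds of component 1 versus component 0 given the observation x.
    log_odds = (math.log(weight1) - math.log(1.0 - weight1)
                - 0.5 * (x - mean1) ** 2 + 0.5 * (x - mean0) ** 2)
    return 1.0 / (1.0 + math.exp(-log_odds))


def bernoulli_kl(q, p, eps=1e-12):
    """KL divergence between Bernoulli(q) and Bernoulli(p), clamped away from 0 and 1."""
    q = min(max(q, eps), 1.0 - eps)
    p = min(max(p, eps), 1.0 - eps)
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))


def latent_kl_error(xs, true_params, est_params):
    """Average per-point KL between the true and estimated latent posteriors.

    This is an illustrative plug-in error, not the paper's error function.
    """
    total = 0.0
    for x in xs:
        q = posterior_label1(x, *true_params)
        p = posterior_label1(x, *est_params)
        total += bernoulli_kl(q, p)
    return total / len(xs)


def sample_mixture(n, mean0, mean1, weight1, rng):
    """Draw n (x, y) pairs; y is the latent label that generated x."""
    data = []
    for _ in range(n):
        y = 1 if rng.random() < weight1 else 0
        x = rng.gauss(mean1 if y == 1 else mean0, 1.0)
        data.append((x, y))
    return data
```

Under these assumptions, the error is zero when the estimated parameters equal the true ones and grows as the estimate drifts. In a semi-supervised variant, the observed labels `y` returned by `sample_mixture` could be used to sharpen the parameter estimate before the error is evaluated.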
Asymptotic Accuracy of Bayesian Estimation for a Single Latent Variable
In data science and machine learning, hierarchical parametric models, such as
mixture models, are often used. They contain two kinds of variables: observable
variables, which represent the parts of the data that can be directly measured,
and latent variables, which represent the underlying processes that generate
the data. Although research on the estimation accuracy for observable
variables has increased, the accuracy of estimating latent variables has not
been thoroughly analyzed theoretically. In a previous study, we
determined the accuracy of a Bayes estimation for the joint probability of the
latent variables in a dataset, and we proved that the Bayes method is
asymptotically more accurate than the maximum-likelihood method. However, the
accuracy of the Bayes estimation for a single latent variable remains unknown.
In the present paper, we derive the asymptotic expansions of the error
functions, which are defined by the Kullback-Leibler divergence, for two types
of single-variable estimations when the statistical regularity is satisfied.
Our results indicate that the accuracies of the Bayes and maximum-likelihood
methods are asymptotically equivalent and clarify that the Bayes method is only
advantageous for multivariable estimations.
Comment: 28 pages, 3 figures
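As a rough illustration of what an error function "defined by the Kullback-Leibler divergence" can look like for a single latent variable, one generic form is sketched below. The notation is assumed and does not come from the abstract: $q(y \mid x)$ is the true conditional distribution of the latent variable $y$ given an observation $x$, $p(y \mid x, X^n)$ is the estimate built from $n$ observations $X^n$, and the expectation is over both $X^n$ and $x$. The paper's two single-variable error functions may differ in detail from this form.

```latex
D(n) = \mathbb{E}_{X^n,\, x}\!\left[ \sum_{y} q(y \mid x)
       \ln \frac{q(y \mid x)}{p(y \mid x, X^n)} \right]
```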