76,941 research outputs found
Learning Arbitrary Statistical Mixtures of Discrete Distributions
We study the problem of learning from unlabeled samples very general
statistical mixture models on large finite sets. Specifically, the model to be
learned, , is a probability distribution over probability
distributions , where each such is a probability distribution over . When we sample from , we do not observe
directly, but only indirectly and in very noisy fashion, by sampling from
repeatedly, independently times from the distribution . The problem is
to infer to high accuracy in transportation (earthmover) distance.
We give the first efficient algorithms for learning this mixture model
without making any restricting assumptions on the structure of the distribution
. We bound the quality of the solution as a function of the size of
the samples and the number of samples used. Our model and results have
applications to a variety of unsupervised learning scenarios, including
learning topic models and collaborative filtering.Comment: 23 pages. Preliminary version in the Proceeding of the 47th ACM
Symposium on the Theory of Computing (STOC15
Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints
Unsupervised estimation of latent variable models is a fundamental problem
central to numerous applications of machine learning and statistics. This work
presents a principled approach for estimating broad classes of such models,
including probabilistic topic models and latent linear Bayesian networks, using
only second-order observed moments. The sufficient conditions for
identifiability of these models are primarily based on weak expansion
constraints on the topic-word matrix, for topic models, and on the directed
acyclic graph, for Bayesian networks. Because no assumptions are made on the
distribution among the latent variables, the approach can handle arbitrary
correlations among the topics or latent factors. In addition, a tractable
learning method via optimization is proposed and studied in numerical
experiments.Comment: 38 pages, 6 figures, 2 tables, applications in topic models and
Bayesian networks are studied. Simulation section is adde
Improving Variational Encoder-Decoders in Dialogue Generation
Variational encoder-decoders (VEDs) have shown promising results in dialogue
generation. However, the latent variable distributions are usually approximated
by a much simpler model than the powerful RNN structure used for encoding and
decoding, yielding the KL-vanishing problem and inconsistent training
objective. In this paper, we separate the training step into two phases: The
first phase learns to autoencode discrete texts into continuous embeddings,
from which the second phase learns to generalize latent representations by
reconstructing the encoded embedding. In this case, latent variables are
sampled by transforming Gaussian noise through multi-layer perceptrons and are
trained with a separate VED model, which has the potential of realizing a much
more flexible distribution. We compare our model with current popular models
and the experiment demonstrates substantial improvement in both metric-based
and human evaluations.Comment: Accepted by AAAI201
- …