Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints
Unsupervised estimation of latent variable models is a fundamental problem
central to numerous applications of machine learning and statistics. This work
presents a principled approach for estimating broad classes of such models,
including probabilistic topic models and latent linear Bayesian networks, using
only second-order observed moments. The sufficient conditions for
identifiability of these models are primarily based on weak expansion
constraints on the topic-word matrix, for topic models, and on the directed
acyclic graph, for Bayesian networks. Because no assumptions are made on the
distribution of the latent variables, the approach can handle arbitrary
correlations among the topics or latent factors. In addition, a tractable
learning method via optimization is proposed and studied in numerical
experiments.
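Since the approach consumes only second-order observed moments, those are cheap to form directly from data. Below is a minimal sketch, assuming a bag-of-words corpus stored as a document-by-word count matrix; the function name and layout are illustrative, not taken from the paper.

```python
import numpy as np

def second_order_moments(doc_word_counts):
    """Empirical second-order word co-occurrence moments.

    doc_word_counts: (num_docs, vocab_size) array of per-document counts.
    Returns a vocab_size x vocab_size estimate of E[x1 x2^T], where x1, x2
    are indicators of two distinct word tokens drawn from the same
    document; the diagonal is corrected so a token is never paired
    with itself.
    """
    X = np.asarray(doc_word_counts, dtype=float)
    # Sum over documents of c c^T, then remove same-token pairs.
    pair_counts = X.T @ X - np.diag(X.sum(axis=0))
    # Normalize by the total number of ordered pairs of distinct tokens.
    doc_lengths = X.sum(axis=1)
    total_pairs = np.sum(doc_lengths * (doc_lengths - 1.0))
    return pair_counts / total_pairs
```

Moment-based estimators of the kind studied in the paper then operate on this small matrix rather than on the raw corpus.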
Latent tree models
Latent tree models are graphical models defined on trees, in which only a
subset of variables is observed. They were first discussed by Judea Pearl as
tree-decomposable distributions to generalise star-decomposable distributions
such as the latent class model. Latent tree models, or their submodels, are
widely used in phylogenetic analysis, network tomography, computer vision,
causal modeling, and data clustering. They also contain other well-known
classes of models such as hidden Markov models, the Brownian motion tree
model, the Ising model on a tree, and many popular models used in
phylogenetics. This article offers a concise introduction to the theory of
latent tree models. We emphasise the role of tree metrics in the structural
description of this model class, in designing learning algorithms, and in
understanding fundamental limits of what can be learned and when.
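The tree metrics emphasised here admit a clean algorithmic test: a metric is a tree metric exactly when every quadruple of points satisfies the four-point condition. The sketch below checks that condition on a distance matrix; it is standard material, not code from this article.

```python
import itertools
import numpy as np

def is_tree_metric(D, tol=1e-9):
    """Check the four-point condition on a symmetric distance matrix D.

    A metric is a tree metric iff, for every four points i, j, k, l,
    the two largest of the three sums
        D[i,j] + D[k,l],  D[i,k] + D[j,l],  D[i,l] + D[j,k]
    are equal (up to the tolerance tol for noisy estimates).
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    for i, j, k, l in itertools.combinations(range(n), 4):
        sums = sorted([D[i, j] + D[k, l],
                       D[i, k] + D[j, l],
                       D[i, l] + D[j, k]])
        if sums[2] - sums[1] > tol:
            return False
    return True
```

Structure-learning algorithms for latent tree models, such as neighbour joining in phylogenetics, implicitly exploit this condition on estimated pairwise distances.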
Tree cumulants and the geometry of binary tree models
In this paper we investigate undirected discrete graphical tree models when
all the variables in the system are binary, where leaves represent the
observable variables and where all the inner nodes are unobserved. A novel
approach based on the theory of partially ordered sets allows us to obtain a
convenient parametrization of this model class. The construction of the
proposed coordinate system mirrors the combinatorial definition of cumulants. A
simple product-like form of the resulting parametrization gives insight into
identifiability issues associated with this model class. In particular, we
provide necessary and sufficient conditions for such a model to be identified
up to the switching of labels of the inner nodes. When these conditions hold,
we give explicit formulas for the parameters of the model. Whenever the model
fails to be identified, we use the new parametrization to describe the geometry
of the unidentified parameter space. We illustrate these results using a simple
example.
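The coordinate system is modelled on the combinatorial (moment-to-cumulant) change of coordinates. As a point of reference, here is a minimal sketch of the standard joint cumulants up to order three computed from samples at the leaves; it illustrates only the moment/cumulant relationship, not the tree-cumulant coordinates themselves.

```python
import numpy as np

def joint_cumulants(X):
    """First-, second-, and third-order joint cumulants from samples.

    X: (num_samples, num_vars) array, e.g. 0/1 observations at the leaves.
    Returns (mean, covariance, third-order cumulant tensor), using the
    standard moment-to-cumulant formulas; for orders up to three,
    cumulants coincide with central moments.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    Xc = X - mu                        # centering removes lower-order terms
    cov = Xc.T @ Xc / len(X)           # kappa_ij = E[(Xi - mui)(Xj - muj)]
    k3 = np.einsum('ni,nj,nk->ijk', Xc, Xc, Xc) / len(X)
    return mu, cov, k3
```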
Calibration of conditional composite likelihood for Bayesian inference on Gibbs random fields
Gibbs random fields play an important role in statistics; however, the
resulting likelihood is typically unavailable due to an intractable
normalizing constant. Composite likelihoods offer a principled means of
constructing useful approximations. This paper provides a means to calibrate
the posterior distribution resulting from using a composite likelihood and
illustrates its performance in several examples.
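The prototypical composite likelihood in this setting is Besag's pseudolikelihood: each full conditional of a Gibbs random field is tractable even though the joint normalizing constant is not. Below is a minimal sketch for the Ising model; the setup is standard and the names are chosen for illustration, not taken from the paper.

```python
import numpy as np

def ising_log_pseudolikelihood(beta, spins, adjacency):
    """Log pseudolikelihood of an Ising model with coupling beta.

    spins: length-n array with entries in {-1, +1}.
    adjacency: (n, n) symmetric 0/1 matrix of the lattice or graph.
    Each full conditional is tractable:
        P(s_i = s | rest) = exp(beta * s * m_i) / (2 * cosh(beta * m_i)),
    where m_i is the sum of the neighbouring spins.
    """
    s = np.asarray(spins, dtype=float)
    m = np.asarray(adjacency, dtype=float) @ s   # neighbour sums m_i
    return np.sum(beta * s * m - np.log(2.0 * np.cosh(beta * m)))
```

Substituting such a composite likelihood for the true likelihood in Bayes' theorem yields the uncalibrated composite posterior that the paper's calibration procedure then corrects.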
The Dependence of Routine Bayesian Model Selection Methods on Irrelevant Alternatives
Bayesian methods, whether based on Bayes factors or BIC, are now widely used
for model selection. One property that might reasonably be demanded of any
model selection method is that if a model M1 is preferred to a model M2 when
these two models are expressed as members of one model class M, this
preference is preserved when they are embedded in a different class M'.
However, we illustrate in this paper that with the usual implementation of
these common Bayesian procedures this property does not hold true even
approximately. We therefore contend that to use these methods it is first
necessary for there to exist a "natural" embedding class. We argue that in
any context like the one illustrated in our running example of Bayesian model
selection of binary phylogenetic trees there is no such embedding.
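For reference, the BIC score used by such routine procedures is fully explicit, and the dependence on the embedding class enters through the effective number of free parameters. A toy sketch with illustrative numbers (not the paper's example):

```python
import numpy as np

def bic(log_likelihood, num_params, num_obs):
    """Bayesian Information Criterion; lower is better.

    num_params counts the free parameters of the model *as a member of
    its embedding class*, which is exactly where the choice of class can
    alter a pairwise preference between two models.
    """
    return num_params * np.log(num_obs) - 2.0 * log_likelihood

# Hypothetical maximized log-likelihoods; the parameter counts depend on
# how the two models are embedded in a class.
n = 500
print(bic(-1210.0, num_params=4, num_obs=n))  # hypothetical model M1
print(bic(-1205.0, num_params=6, num_obs=n))  # hypothetical model M2
```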