Latent Factor Analysis of Gaussian Distributions under Graphical Constraints
We explore the algebraic structure of the solution space of the convex
optimization problem Constrained Minimum Trace Factor Analysis (CMTFA) when
the population covariance matrix has an additional latent graphical
constraint, namely a latent star topology. In particular, we show that
CMTFA can have either a rank 1 or a rank n-1 solution and nothing in
between. The special case of a rank 1 solution corresponds to the case
where just one latent variable captures all the dependencies among the
observables, giving rise to a star topology. We found explicit conditions for
both rank 1 and rank n-1 CMTFA solutions of the population covariance
matrix. As a basic attempt towards building a more general Gaussian tree, we
found a necessary and a sufficient condition for multiple clusters, each
having a rank 1 CMTFA solution, to satisfy a minimum probability of
combining together to build a Gaussian tree. To support our analytical
findings we present numerical results demonstrating the usefulness of the
contributions of our work.
Comment: 9 pages, 4 figures
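The rank 1 case above can be made concrete: when a single latent variable drives all observables, the population covariance is a rank-one matrix plus a diagonal of noise variances. A minimal numpy sketch (the loadings and noise variances are hypothetical values for illustration, not the paper's CMTFA algorithm):

```python
import numpy as np

# In a latent star topology, one latent variable z drives all n observables,
# x_i = v_i * z + noise_i, so the population covariance is
#   Sigma = v v^T + D,  with D diagonal.
# Subtracting the right diagonal D leaves a rank-1 matrix -- the "rank 1"
# CMTFA case described in the abstract.
rng = np.random.default_rng(0)
n = 5
v = rng.uniform(0.5, 1.0, size=n)            # latent loadings (hypothetical)
D = np.diag(rng.uniform(0.1, 0.3, size=n))   # independent noise variances
Sigma = np.outer(v, v) + D

low_rank = Sigma - D                          # remove the diagonal part
print(np.linalg.matrix_rank(low_rank))        # 1
```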
Latent tree models
Latent tree models are graphical models defined on trees, in which only a
subset of variables is observed. They were first discussed by Judea Pearl as
tree-decomposable distributions to generalise star-decomposable distributions
such as the latent class model. Latent tree models, or their submodels, are
widely used in: phylogenetic analysis, network tomography, computer vision,
causal modeling, and data clustering. They also contain other well-known
classes of models like hidden Markov models, Brownian motion tree model, the
Ising model on a tree, and many popular models used in phylogenetics. This
article offers a concise introduction to the theory of latent tree models. We
emphasise the role of tree metrics in the structural description of this model
class, in designing learning algorithms, and in understanding fundamental
limits of what can be learned, and when.
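The role of tree metrics emphasised above can be illustrated with the classical four-point condition: a metric is realisable by a tree exactly when, for every quadruple of points, the two largest of the three pairing sums coincide. A small self-contained check on a star tree with assumed edge lengths (an illustrative example, not taken from the article):

```python
import itertools

def four_point_ok(d, quad, tol=1e-9):
    # Four-point condition: among d(i,j)+d(k,l), d(i,k)+d(j,l), d(i,l)+d(j,k),
    # the two largest sums must be equal for a tree metric.
    i, j, k, l = quad
    s = sorted([d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]])
    return abs(s[1] - s[2]) < tol

# Path distances on a star tree: leaves 0..4 hang off a common centre with
# edge lengths w, so d(a, b) = w[a] + w[b] for distinct leaves.
w = [1.0, 2.0, 3.0, 4.0, 5.0]
d = [[0.0 if a == b else w[a] + w[b] for b in range(5)] for a in range(5)]

print(all(four_point_ok(d, q) for q in itertools.combinations(range(5), 4)))  # True
```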
Flexible sampling of discrete data correlations without the marginal distributions
Learning the joint dependence of discrete variables is a fundamental problem
in machine learning, with many applications including prediction, clustering
and dimensionality reduction. More recently, the framework of copula modeling
has gained popularity due to its modular parametrization of joint
distributions. Among other properties, copulas provide a recipe for combining
flexible models for univariate marginal distributions with parametric families
suitable for potentially high dimensional dependence structures. More
radically, the extended rank likelihood approach of Hoff (2007) bypasses
learning marginal models completely when such information is ancillary to the
learning task at hand as in, e.g., standard dimensionality reduction problems
or copula parameter estimation. The main idea is to represent data by their
observable rank statistics, ignoring any other information from the marginals.
Inference is typically done in a Bayesian framework with Gaussian copulas, and
it is complicated by the fact that this implies sampling within a space where the
number of constraints increases quadratically with the number of data points.
The result is slow mixing when using off-the-shelf Gibbs sampling. We present
an efficient algorithm based on recent advances on constrained Hamiltonian
Markov chain Monte Carlo that is simple to implement and does not require
paying for a quadratic cost in sample size.
Comment: An overhauled version of the experimental section moved to the main
paper. Old experimental section moved to supplementary material.
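The quadratic constraint growth noted in the abstract can be made concrete: conditioning latent Gaussian values on the observed rank statistics imposes one order constraint per strictly ordered pair of data points, so the constraint count scales as n^2 per dimension. A toy count (illustrative only, not Hoff's extended rank likelihood machinery):

```python
# Extended rank likelihood (sketch of the constraint set, one dimension):
# the latent Gaussian values z_1..z_n must preserve the observed ordering,
# i.e. y_i < y_j implies z_i < z_j. Ties impose no constraint.
def n_order_constraints(y):
    n = len(y)
    return sum(1 for i in range(n) for j in range(n) if i != j and y[i] < y[j])

y = [3, 1, 4, 1, 5]          # hypothetical observations
print(n_order_constraints(y))  # 9: every non-tied pair contributes once
```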
Binary Models for Marginal Independence
Log-linear models are a classical tool for the analysis of contingency
tables. In particular, the subclass of graphical log-linear models provides a
general framework for modelling conditional independences. However, with the
exception of special structures, marginal independence hypotheses cannot be
accommodated by these traditional models. Focusing on binary variables, we
present a model class that provides a framework for modelling marginal
independences in contingency tables. The approach taken is graphical and draws
on analogies to multivariate Gaussian models for marginal independence. For the
graphical model representation we use bi-directed graphs, which are in the
tradition of path diagrams. We show how the models can be parameterized in a
simple fashion, and how maximum likelihood estimation can be performed using a
version of the Iterated Conditional Fitting algorithm. Finally we consider
combining these models with symmetry restrictions.
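The marginal independences encoded by bi-directed graphs can be illustrated on the smallest interesting case: in a bi-directed chain X1 <-> X2 <-> X3, the missing edge between X1 and X3 encodes the marginal independence of X1 and X3. A toy construction and check (an assumed example with hypothetical probabilities, not the paper's parameterization or fitting algorithm):

```python
import numpy as np

# Build a joint distribution over three binary variables in which X1 and X3
# are marginally independent while X2 depends on both.
p1 = np.array([0.4, 0.6])          # P(X1)
p3 = np.array([0.3, 0.7])          # P(X3)
joint = np.zeros((2, 2, 2))        # axes: (X1, X2, X3)
for a in range(2):
    for c in range(2):
        p2 = 0.2 + 0.5 * (a == c)  # P(X2=1 | X1=a, X3=c), hypothetical
        joint[a, 1, c] = p1[a] * p3[c] * p2
        joint[a, 0, c] = p1[a] * p3[c] * (1 - p2)

# Marginalizing out X2 recovers a product distribution over (X1, X3).
m13 = joint.sum(axis=1)
print(np.allclose(m13, np.outer(p1, p3)))  # True
```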
Star graphs induce tetrad correlations: for Gaussian as well as for binary variables
Tetrad correlations were obtained historically for Gaussian distributions
when tasks are designed to measure an ability or attitude so that a single
unobserved variable may generate the observed, linearly increasing dependences
among the tasks. We connect such generating processes to a particular type of
directed graph, the star graph, and to the notion of traceable regressions.
Tetrad correlation conditions for the existence of a single latent variable are
derived. These are needed for positive dependences not only in joint Gaussian
but also in joint binary distributions. Three applications with binary items
are given.
Comment: 21 pages, 2 figures, 5 tables
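The tetrad conditions can be checked numerically: under a single-factor (star-graph) generating process, pairwise correlations factor as rho_ij = lam_i * lam_j, so every tetrad product identity holds. A quick sketch with hypothetical loadings:

```python
import numpy as np

# One latent variable with loadings lam generates correlations
# rho_ij = lam_i * lam_j (i != j), so the three tetrad products over any
# quadruple of observables coincide.
lam = np.array([0.9, 0.8, 0.7, 0.6])   # hypothetical loadings
rho = np.outer(lam, lam)

t1 = rho[0, 1] * rho[2, 3]
t2 = rho[0, 2] * rho[1, 3]
t3 = rho[0, 3] * rho[1, 2]
print(np.isclose(t1, t2) and np.isclose(t2, t3))  # True
```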