12,936 research outputs found

    Latent Factor Analysis of Gaussian Distributions under Graphical Constraints

    We explore the algebraic structure of the solution space of the convex optimization problem Constrained Minimum Trace Factor Analysis (CMTFA) when the population covariance matrix $\Sigma_x$ has an additional latent graphical constraint, namely a latent star topology. In particular, we have shown that CMTFA can have either a rank 1 or a rank $n-1$ solution and nothing in between. The special case of a rank 1 solution corresponds to the case where just one latent variable captures all the dependencies among the observables, giving rise to a star topology. We found explicit conditions for both rank 1 and rank $n-1$ CMTFA solutions of $\Sigma_x$. As a basic attempt towards building a more general Gaussian tree, we have found a necessary and a sufficient condition for multiple clusters, each having a rank 1 CMTFA solution, to satisfy a minimum probability to combine together to build a Gaussian tree. To support our analytical findings, we have presented some numerical results demonstrating the usefulness of the contributions of our work. Comment: 9 pages, 4 figures

    Latent tree models

    Latent tree models are graphical models defined on trees, in which only a subset of variables is observed. They were first discussed by Judea Pearl as tree-decomposable distributions to generalise star-decomposable distributions such as the latent class model. Latent tree models, or their submodels, are widely used in phylogenetic analysis, network tomography, computer vision, causal modeling, and data clustering. They also contain other well-known classes of models such as hidden Markov models, the Brownian motion tree model, the Ising model on a tree, and many popular models used in phylogenetics. This article offers a concise introduction to the theory of latent tree models. We emphasise the role of tree metrics in the structural description of this model class, in designing learning algorithms, and in understanding the fundamental limits of what can be learned and when.
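As a rough illustration of what "tree metric" means here, the sketch below checks the classical four-point condition: for every four leaves, the two largest of the three pairwise distance sums must coincide. The function name `satisfies_four_point` and the toy distance matrices are assumptions made for this example and are not taken from the article.

```python
import itertools


def satisfies_four_point(dist, tol=1e-9):
    """Return True if the symmetric distance matrix passes the four-point condition."""
    n = len(dist)
    for i, j, k, l in itertools.combinations(range(n), 4):
        # The three ways of pairing up four leaves.
        sums = sorted([
            dist[i][j] + dist[k][l],
            dist[i][k] + dist[j][l],
            dist[i][l] + dist[j][k],
        ])
        # On a tree, the two largest sums are equal.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Hypothetical quartet tree: leaves {0, 1} and {2, 3} on either side of an
# internal edge of length 2, with every pendant edge of length 1.
tree_dist = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]

# Same matrix with one distance perturbed, so it no longer fits any tree.
non_tree_dist = [row[:] for row in tree_dist]
non_tree_dist[0][2] = non_tree_dist[2][0] = 5
```

Calling `satisfies_four_point(tree_dist)` accepts the quartet distances, while the perturbed matrix is rejected.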

    Flexible sampling of discrete data correlations without the marginal distributions

    Learning the joint dependence of discrete variables is a fundamental problem in machine learning, with many applications including prediction, clustering and dimensionality reduction. More recently, the framework of copula modeling has gained popularity due to its modular parametrization of joint distributions. Among other properties, copulas provide a recipe for combining flexible models for univariate marginal distributions with parametric families suitable for potentially high dimensional dependence structures. More radically, the extended rank likelihood approach of Hoff (2007) bypasses learning marginal models completely when such information is ancillary to the learning task at hand as in, e.g., standard dimensionality reduction problems or copula parameter estimation. The main idea is to represent data by their observable rank statistics, ignoring any other information from the marginals. Inference is typically done in a Bayesian framework with Gaussian copulas, and it is complicated by the fact that this implies sampling within a space where the number of constraints increases quadratically with the number of data points. The result is slow mixing when using off-the-shelf Gibbs sampling. We present an efficient algorithm based on recent advances in constrained Hamiltonian Markov chain Monte Carlo that is simple to implement and does not require paying for a quadratic cost in sample size. Comment: An overhauled version of the experimental section moved to the main paper. Old experimental section moved to supplementary material

    Binary Models for Marginal Independence

    Log-linear models are a classical tool for the analysis of contingency tables. In particular, the subclass of graphical log-linear models provides a general framework for modelling conditional independences. However, with the exception of special structures, marginal independence hypotheses cannot be accommodated by these traditional models. Focusing on binary variables, we present a model class that provides a framework for modelling marginal independences in contingency tables. The approach taken is graphical and draws on analogies to multivariate Gaussian models for marginal independence. For the graphical model representation we use bi-directed graphs, which are in the tradition of path diagrams. We show how the models can be parameterized in a simple fashion, and how maximum likelihood estimation can be performed using a version of the Iterated Conditional Fitting algorithm. Finally, we consider combining these models with symmetry restrictions.

    Star graphs induce tetrad correlations: for Gaussian as well as for binary variables

    Tetrad correlations were obtained historically for Gaussian distributions when tasks are designed to measure an ability or attitude so that a single unobserved variable may generate the observed, linearly increasing dependences among the tasks. We connect such generating processes to a particular type of directed graph, the star graph, and to the notion of traceable regressions. Tetrad correlation conditions for the existence of a single latent variable are derived. These are needed for positive dependences not only in joint Gaussian but also in joint binary distributions. Three applications with binary items are given. Comment: 21 pages, 2 figures, 5 tables
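To make the tetrad idea concrete, the sketch below uses the standard one-factor Gaussian setting, where each pair of observed variables has correlation equal to the product of their loadings on the single latent variable. In that setting every tetrad difference, such as ρ_ij ρ_kl − ρ_ik ρ_jl, vanishes. The function names and the loading values are hypothetical choices for this illustration and do not reproduce the paper's derivation.

```python
import itertools

import numpy as np


def one_factor_correlation(loadings):
    """Correlation matrix implied by a single latent factor (star graph)."""
    lam = np.asarray(loadings, dtype=float)
    corr = np.outer(lam, lam)      # off-diagonal entries: lam[i] * lam[j]
    np.fill_diagonal(corr, 1.0)    # unit variances on the diagonal
    return corr


def max_tetrad_difference(corr):
    """Largest absolute tetrad difference over all quadruples of variables."""
    n = corr.shape[0]
    worst = 0.0
    for i, j, k, l in itertools.combinations(range(n), 4):
        t1 = corr[i, j] * corr[k, l] - corr[i, k] * corr[j, l]
        t2 = corr[i, j] * corr[k, l] - corr[i, l] * corr[j, k]
        worst = max(worst, abs(t1), abs(t2))
    return worst


# Hypothetical loadings for five observed variables.
star_corr = one_factor_correlation([0.9, 0.8, 0.7, 0.6, 0.5])

# Perturb one correlation so that no single factor can reproduce the matrix.
perturbed_corr = star_corr.copy()
perturbed_corr[0, 1] = perturbed_corr[1, 0] = 0.1
```

For `star_corr` the maximum tetrad difference is zero up to floating-point error, whereas `perturbed_corr` gives a clearly nonzero value.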