
    Parametric Inference for Biological Sequence Analysis

    One of the major successes in computational biology has been the unification, using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems associated with different statistical models. This paper introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models. Comment: 15 pages, 4 figures. See also companion paper "Tropical Geometry of Statistical Models" (q-bio.QM/0311009).
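
As a rough illustration of the sum-product idea mentioned in this abstract, the sketch below runs the forward recursion on a two-state hidden Markov model and checks it against brute-force enumeration of all hidden paths. The parameter values and the function names `forward_likelihood` and `brute_force_likelihood` are assumptions made up for this example; they do not come from the paper.

```python
from itertools import product

# Hypothetical toy HMM with two hidden states and two output symbols.
# All numbers are illustrative assumptions, not values from the paper.
init = [0.6, 0.4]                     # P(z_1 = s)
trans = [[0.7, 0.3], [0.2, 0.8]]      # P(z_t = s | z_{t-1} = r) = trans[r][s]
emit = [[0.9, 0.1], [0.3, 0.7]]       # P(x_t = x | z_t = s) = emit[s][x]
n_states = len(init)


def forward_likelihood(obs):
    """Sum-product on a chain: sums are pushed inside products,
    so the cost is linear in the length of the observation."""
    alpha = [init[s] * emit[s][obs[0]] for s in range(n_states)]
    for x in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][x]
            for s in range(n_states)
        ]
    return sum(alpha)


def brute_force_likelihood(obs):
    """Same quantity computed by summing over every hidden path
    (exponential cost), used only as a correctness check."""
    total = 0.0
    for path in product(range(n_states), repeat=len(obs)):
        p = init[path[0]] * emit[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
        total += p
    return total


observation = [0, 1, 1, 0]
fast = forward_likelihood(observation)
slow = brute_force_likelihood(observation)
```

Both functions compute the marginal probability of the observation; the recursion simply reorders the sums and products using the distributive law.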

    Sum-Product Graphical Models: a Graphical Model Perspective on Sum-Product Networks

    The trade-off between expressiveness of representation and tractability of inference is a key issue in probabilistic modeling. On the one hand, probabilistic Graphical Models (GMs) provide a high-level representation of distributions, but exact inference with cyclic graphs is in general intractable. On the other hand, Sum-Product Networks (SPNs) allow tractable exact inference with probability distributions that are more complex than tractable GMs, but they employ a low-level representation of the underlying distribution, which is much harder to read and interpret than in GMs. The objective of this thesis is to close this gap and to achieve simultaneously the high-level representation of GMs and the efficiency of SPNs. To this aim, new models and procedures are introduced. We first investigate SPNs that include GMs as a submodule, obtaining a derivation of Expectation-Maximization for SPNs that is the first to allow the GM part to be learned alongside the SPN parameters. Then, we introduce a new architecture called Sum-Product Graphical Model (SPGM). This new architecture is the first to combine the semantics of graphical models with the evaluation efficiency of SPNs: SPGMs always enable tractable inference using a class of models that incorporate context-specific independence (like SPNs), and they provide a high-level model interpretation in terms of conditional independence assumptions and corresponding factorizations (like GMs). An algorithm for learning both the structure and the model parameters of SPGMs is also introduced. Finally, several applications that illustrate and empirically motivate the introduction of the new models are described. SPGMs are applied to real-world discrete density estimation datasets, to augment a graphical model for segmenting scans of the human retina and detecting local pathologies, and to model very large mixtures of Quadtrees for image denoising. Strong empirical results and novel application areas indicate promise for future applications of SPGMs.
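
To make the notion of tractable inference in a sum-product network more concrete, here is a minimal sketch of a two-variable SPN evaluated bottom-up. The classes `Leaf`, `Product`, and `Sum`, the network structure, and all weights are hypothetical choices for this example; this is a plain SPN, not the SPGM architecture from the thesis.

```python
from itertools import product as all_states


class Leaf:
    """Univariate distribution over one binary variable."""

    def __init__(self, var, probs):
        self.var = var
        self.probs = probs  # probs[x] = P(var = x)

    def value(self, evidence):
        x = evidence.get(self.var)
        # An unobserved variable is marginalized out: the leaf sums to 1.
        return 1.0 if x is None else self.probs[x]


class Product:
    """Product node: children have disjoint variable scopes."""

    def __init__(self, children):
        self.children = children

    def value(self, evidence):
        result = 1.0
        for child in self.children:
            result *= child.value(evidence)
        return result


class Sum:
    """Sum node: weighted mixture of children with identical scopes."""

    def __init__(self, weighted_children):
        self.weighted_children = weighted_children  # list of (weight, child)

    def value(self, evidence):
        return sum(w * child.value(evidence) for w, child in self.weighted_children)


# Hypothetical network: a mixture of two fully factorized distributions.
spn = Sum([
    (0.4, Product([Leaf("A", [0.8, 0.2]), Leaf("B", [0.3, 0.7])])),
    (0.6, Product([Leaf("A", [0.1, 0.9]), Leaf("B", [0.6, 0.4])])),
])

# Joint, marginal, and normalization queries each take one bottom-up pass.
joint = spn.value({"A": 1, "B": 0})
marginal_a1 = spn.value({"A": 1})
total = sum(spn.value({"A": a, "B": b}) for a, b in all_states([0, 1], repeat=2))
```

Within each mixture component A and B are independent, while the mixture as a whole makes them dependent; a marginal is obtained by omitting a variable from the evidence rather than by summing over its values explicitly.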

    The Libra Toolkit for Probabilistic Models

    The Libra Toolkit is a collection of algorithms for learning and inference with discrete probabilistic models, including Bayesian networks, Markov networks, dependency networks, and sum-product networks. Compared to other toolkits, Libra places a greater emphasis on learning the structure of tractable models in which exact inference is efficient. It also includes a variety of algorithms for learning graphical models in which inference is potentially intractable, and for performing exact and approximate inference. Libra is released under a 2-clause BSD license to encourage broad use in academia and industry.

    Tropical Geometry of Statistical Models

    This paper presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. The question addressed here is how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. A key role is played by the Newton polytope of a statistical model. Our results are applied to the hidden Markov model and to the general Markov model on a binary tree. Comment: 14 pages, 3 figures. Major revision. Applications now in companion paper, "Parametric Inference for Biological Sequence Analysis".
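
One way to see the tropical viewpoint in miniature: the same chain recursion can be evaluated in the ordinary (+, ×) semiring to get a likelihood, or in the (min, +) semiring on negative log parameters to get the cost of the most probable hidden path. The sketch below assumes a two-state hidden Markov model with made-up parameters; `chain_eval` and the other names are hypothetical helpers for this example, not the paper's notation.

```python
import math
from itertools import product

# Hypothetical toy HMM; all numbers are illustrative assumptions.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]      # trans[r][s] = P(next = s | current = r)
emit = [[0.9, 0.1], [0.3, 0.7]]       # emit[s][x] = P(symbol x | state s)


def chain_eval(obs, add, mul, lift):
    """Forward recursion written over an abstract semiring.

    add / mul are the semiring operations; lift maps a probability
    into the semiring (identity, or -log for the tropical case)."""
    alpha = [mul(lift(init[s]), lift(emit[s][obs[0]])) for s in range(2)]
    for x in obs[1:]:
        new_alpha = []
        for s in range(2):
            acc = add(
                mul(alpha[0], lift(trans[0][s])),
                mul(alpha[1], lift(trans[1][s])),
            )
            new_alpha.append(mul(acc, lift(emit[s][x])))
        alpha = new_alpha
    return add(alpha[0], alpha[1])


obs = [0, 1, 1, 0]

# Ordinary arithmetic: marginal likelihood of the observation.
likelihood = chain_eval(obs, lambda a, b: a + b, lambda a, b: a * b, lambda p: p)

# Tropical (min, +) arithmetic on -log parameters: cost of the best hidden path.
tropical_cost = chain_eval(obs, min, lambda a, b: a + b, lambda p: -math.log(p))

# Brute-force check: probability of the single most likely hidden path.
best_path_prob = 0.0
for path in product(range(2), repeat=len(obs)):
    p = init[path[0]] * emit[path[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
    best_path_prob = max(best_path_prob, p)
```

Exponentiating the negated tropical cost recovers the probability of the most likely path, which is the maximum a posteriori quantity whose dependence on the parameters the abstract refers to.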