Parametric Inference for Biological Sequence Analysis
One of the major successes in computational biology has been the unification,
using the graphical model formalism, of a multitude of algorithms for
annotating and comparing biological sequences. Graphical models that have been
applied towards these problems include hidden Markov models for annotation,
tree models for phylogenetics, and pair hidden Markov models for alignment. A
single algorithm, the sum-product algorithm, solves many of the inference
problems associated with different statistical models. This paper introduces
the \emph{polytope propagation algorithm} for computing the Newton polytope of
an observation from a graphical model. This algorithm is a geometric version of
the sum-product algorithm and is used to analyze the parametric behavior of
maximum a posteriori inference calculations for graphical models.
Comment: 15 pages, 4 figures. See also companion paper "Tropical Geometry of
Statistical Models" (q-bio.QM/0311009)
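To illustrate the idea of a "geometric version" of the sum-product algorithm, the following is a minimal sketch, not the paper's implementation. It assumes two parameters (so polytopes are planar) and replaces numeric addition with the convex hull of a union and numeric multiplication with the Minkowski sum. The helper names (`hull`, `p_add`, `p_mul`) and the toy polynomial (x + y)^2 (1 + x) are hypothetical choices made for this example.

```python
from itertools import product


def cross(o, a, b):
    """Signed area of the turn o -> a -> b (positive means counter-clockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull(points):
    """Vertices of the convex hull of 2-D integer points (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the duplicated endpoints and return the vertices in sorted order.
    return sorted(lower[:-1] + upper[:-1])


def p_add(P, Q):
    """Polytope counterpart of '+': convex hull of the union."""
    return hull(P + Q)


def p_mul(P, Q):
    """Polytope counterpart of '*': Minkowski sum."""
    return hull([(a[0] + b[0], a[1] + b[1]) for a, b in product(P, Q)])


# Each monomial x^i y^j is represented by its exponent vector (i, j).
ONE = [(0, 0)]
X = [(1, 0)]
Y = [(0, 1)]

# Newton polytope of (x + y)^2 * (1 + x), built without expanding the polynomial.
x_plus_y = p_add(X, Y)
square = p_mul(x_plus_y, x_plus_y)
newton = p_mul(square, p_add(ONE, X))
print(newton)
```

The monomial xy of (x + y)^2 has exponent vector (1, 1), which lies on an edge of the polytope and is therefore dropped as a non-vertex. In the setting of the abstract, the same substitution would be applied to the recursion of the sum-product algorithm rather than to a hand-written polynomial.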
Sum-Product Graphical Models: a Graphical Model Perspective on Sum-Product Networks
The trade-off between expressiveness of representation and tractability of inference is a key issue in probabilistic modelling. On the one hand, probabilistic Graphical Models (GMs) provide a high-level representation of distributions, but exact inference with cyclic graphs is in general intractable. On the other hand, Sum-Product Networks (SPNs) allow tractable exact inference with probability distributions that are more complex than tractable GMs, but they employ a low-level representation of the underlying distribution, which is much harder to read and interpret than in GMs.
The objective of this thesis is to close this gap and to achieve simultaneously the high level representation of GMs and the efficiency of SPNs. To this aim, new models and procedures are introduced.
We first investigate SPNs that include GMs as a submodule, obtaining a derivation of Expectation-Maximization for SPNs that is the first to allow learning the GM part alongside the SPN parameters.
Then, we introduce a new architecture called Sum-Product Graphical Model (SPGM). This new architecture is the first to combine the semantics of graphical models with the evaluation efficiency of SPNs: SPGMs always enable tractable inference using a class of models that incorporate context specific independence (like SPNs), and they provide a high-level model interpretation in terms of conditional independence assumptions and corresponding factorizations (like GMs). An algorithm for learning both the structure and the model parameters of SPGMs is also introduced.
Finally, several applications that illustrate and empirically motivate the introduction of the new models are described. SPGMs are applied to real-world discrete density estimation datasets, to augment a graphical model for segmenting scans of the human retina and detecting local pathologies, and to model very large mixtures of Quadtrees for image denoising. Strong empirical results and novel application areas indicate promise for future applications of SPGMs.
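As a rough illustration of why inference in a Sum-Product Network is tractable, the sketch below builds a tiny SPN over two binary variables and evaluates it bottom-up. It is a toy example under stated assumptions, not the SPGM architecture of the thesis: the node constructors (`leaf`, `prod`, `wsum`, `bern`), the variable names, and all weights are hypothetical. Marginalizing a variable amounts to setting both of its indicator leaves to 1, so a marginal costs one pass over the network.

```python
def leaf(var, value):
    """Indicator leaf: 1 if var is unobserved (marginalized) or equals value."""
    return lambda ev: 1.0 if ev.get(var) in (None, value) else 0.0


def prod(*children):
    """Product node over children with disjoint variable scopes."""
    def evaluate(ev):
        out = 1.0
        for child in children:
            out *= child(ev)
        return out
    return evaluate


def wsum(weighted_children):
    """Sum node: weighted mixture of children over the same scope."""
    return lambda ev: sum(w * child(ev) for w, child in weighted_children)


def bern(var, p):
    """Bernoulli distribution over one binary variable, written as a sum node."""
    return wsum([(p, leaf(var, 1)), (1.0 - p, leaf(var, 0))])


# A mixture of two fully factorized components over A and B.
spn = wsum([
    (0.3, prod(bern("A", 0.9), bern("B", 0.2))),
    (0.7, prod(bern("A", 0.1), bern("B", 0.6))),
])

joint = spn({"A": 1, "B": 1})   # P(A=1, B=1)
marginal = spn({"A": 1})        # P(A=1), with B summed out in the same pass
total = spn({})                 # all variables marginalized
print(joint, marginal, total)
```

The two components disagree on how A and B are distributed, which is a small instance of the context-specific independence mentioned above: A and B are independent within each mixture component but not in the mixture as a whole.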
The Libra Toolkit for Probabilistic Models
The Libra Toolkit is a collection of algorithms for learning and inference
with discrete probabilistic models, including Bayesian networks, Markov
networks, dependency networks, and sum-product networks. Compared to other
toolkits, Libra places a greater emphasis on learning the structure of
tractable models in which exact inference is efficient. It also includes a
variety of algorithms for learning graphical models in which inference is
potentially intractable, and for performing exact and approximate inference.
Libra is released under a 2-clause BSD license to encourage broad use in
academia and industry.
Tropical Geometry of Statistical Models
This paper presents a unified mathematical framework for inference in
graphical models, building on the observation that graphical models are
algebraic varieties.
From this geometric viewpoint, observations generated from a model are
coordinates of a point in the variety, and the sum-product algorithm is an
efficient tool for evaluating specific coordinates. The question addressed here
is how the solutions to various inference problems depend on the model
parameters. The proposed answer is expressed in terms of tropical algebraic
geometry. A key role is played by the Newton polytope of a statistical model.
Our results are applied to the hidden Markov model and to the general Markov
model on a binary tree.
Comment: 14 pages, 3 figures. Major revision. Applications now in companion
paper, "Parametric Inference for Biological Sequence Analysis".
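The link between the sum-product algorithm and tropical arithmetic can be sketched on a small hidden Markov model. The example below is an illustration under assumptions, not code from the paper: the two-state model, its parameter values, and the function names are hypothetical. The same forward recursion is run twice, once over ordinary (+, ×) arithmetic, which yields the probability of the observation, and once over the tropical (min, +) semiring on negative log parameters, which yields the negative log probability of the most likely hidden path.

```python
import math
from itertools import product

# Hypothetical two-state HMM with binary emissions.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]
emit = [[0.9, 0.1], [0.3, 0.7]]


def forward(obs, add, mul, f=lambda p: p):
    """Forward recursion over a semiring (add, mul); f maps parameters into it."""
    n = len(init)
    alpha = [mul(f(init[s]), f(emit[s][obs[0]])) for s in range(n)]
    for o in obs[1:]:
        new_alpha = []
        for t in range(n):
            acc = mul(alpha[0], f(trans[0][t]))
            for s in range(1, n):
                acc = add(acc, mul(alpha[s], f(trans[s][t])))
            new_alpha.append(mul(acc, f(emit[t][o])))
        alpha = new_alpha
    total = alpha[0]
    for a in alpha[1:]:
        total = add(total, a)
    return total


def path_prob(path, obs):
    """Joint probability of one hidden path and the observation (brute force)."""
    p = init[path[0]] * emit[path[0]][obs[0]]
    for prev, cur, o in zip(path, path[1:], obs[1:]):
        p *= trans[prev][cur] * emit[cur][o]
    return p


obs = [0, 1, 1]

# Ordinary arithmetic: sum over all hidden paths.
likelihood = forward(obs, lambda a, b: a + b, lambda a, b: a * b)

# Tropical arithmetic on -log parameters: minimum over all hidden paths.
tropical = forward(obs, min, lambda a, b: a + b, f=lambda p: -math.log(p))

# Brute-force enumeration, used only to check the two recursions.
all_probs = [path_prob(p, obs) for p in product(range(2), repeat=len(obs))]
print(likelihood, math.exp(-tropical))
```

Only the pair (add, mul) and the parameter map change between the two runs; the recursion itself is untouched, which is the sense in which the tropical computation is a variant of the sum-product algorithm.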