227 research outputs found

    Extending Factor Graphs so as to Unify Directed and Undirected Graphical Models

    Full text link
    The two most popular types of graphical model are directed models (Bayesian networks) and undirected models (Markov random fields, or MRFs). Directed and undirected models offer complementary properties in model construction, expressing conditional independencies, expressing arbitrary factorizations of joint distributions, and formulating message-passing inference algorithms. We show that the strengths of these two representations can be combined in a single type of graphical model called a 'factor graph'. Every Bayesian network or MRF can be easily converted to a factor graph that expresses the same conditional independencies, expresses the same factorization of the joint distribution, and can be used for probabilistic inference through application of a single, simple message-passing algorithm. In contrast to chain graphs, where message-passing is implemented on a hypergraph, message-passing can be directly implemented on the factor graph. We describe a modified 'Bayes-ball' algorithm for establishing conditional independence in factor graphs, and we show that factor graphs form a strict superset of Bayesian networks and MRFs. In particular, we give an example of a commonly-used 'mixture of experts' model fragment, whose independencies cannot be represented in a Bayesian network or an MRF, but can be represented in a factor graph. We finish by giving examples of real-world problems that are not well suited to representation in Bayesian networks and MRFs, but are well-suited to representation in factor graphs.Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003

    Cumulative distribution networks and the derivative-sum-product algorithm

    Full text link
    We introduce a new type of graphical model called a "cumulative distribution network" (CDN), which expresses a joint cumulative distribution as a product of local functions. Each local function can be viewed as providing evidence about possible orderings, or rankings, of variables. Interestingly, we find that the conditional independence properties of CDNs are quite different from other graphical models. We also describe a messagepassing algorithm that efficiently computes conditional cumulative distributions. Due to the unique independence properties of the CDN, these messages do not in general have a one-to-one correspondence with messages exchanged in standard algorithms, such as belief propagation. We demonstrate the application of CDNs for structured ranking learning using a previously-studied multi-player gaming dataset.Comment: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008

    Learning Generative Models of Similarity Matrices

    Full text link
    We describe a probabilistic (generative) view of affinity matrices along with inference algorithms for a subclass of problems associated with data clustering. This probabilistic view is helpful in understanding different models and algorithms that are based on affinity functions OF the data. IN particular, we show how(greedy) inference FOR a specific probabilistic model IS equivalent TO the spectral clustering algorithm.It also provides a framework FOR developing new algorithms AND extended models. AS one CASE, we present new generative data clustering models that allow us TO infer the underlying distance measure suitable for the clustering problem at hand. These models seem to perform well in a larger class of problems for which other clustering algorithms (including spectral clustering) usually fail. Experimental evaluation was performed in a variety point data sets, showing excellent performance.Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003

    Matrix Tile Analysis

    Full text link
    Many tasks require finding groups of elements in a matrix of numbers, symbols or class likelihoods. One approach is to use efficient bi- or tri-linear factorization techniques including PCA, ICA, sparse matrix factorization and plaid analysis. These techniques are not appropriate when addition and multiplication of matrix elements are not sensibly defined. More directly, methods like bi-clustering can be used to classify matrix elements, but these methods make the overly-restrictive assumption that the class of each element is a function of a row class and a column class. We introduce a general computational problem, `matrix tile analysis' (MTA), which consists of decomposing a matrix into a set of non-overlapping tiles, each of which is defined by a subset of usually nonadjacent rows and columns. MTA does not require an algebra for combining tiles, but must search over discrete combinations of tile assignments. Exact MTA is a computationally intractable integer programming problem, but we describe an approximate iterative technique and a computationally efficient sum-product relaxation of the integer program. We compare the effectiveness of these methods to PCA and plaid on hundreds of randomly generated tasks. Using double-gene-knockout data, we show that MTA finds groups of interacting yeast genes that have biologically-related functions.Comment: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006

    A Factorized Variational Technique for Phase Unwrapping in Markov Random Fields

    Full text link
    Some types of medical and topographic imaging device produce images in which the pixel values are "phase-wrapped", i.e. measured modulus a known scalar. Phase unwrapping can be viewed as the problem of inferring the number of shifts between each and every pair of neighboring pixels, subject to an a priori preference for smooth surfaces, and subject to a zero curl constraint, which requires that the shifts must sum to 0 around every loop. We formulate phase unwrapping as a mean field inference problem in a Markov network, where the prior favors the zero curl constraint. We compare our mean field technique with the least squares method on a synthetic 100x100 image, and give results on a 512x512 synthetic aperture radar image from Sandia National Laboratories.<Long Text>Comment: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001

    Variational Learning in Mixed-State Dynamic Graphical Models

    Full text link
    Many real-valued stochastic time-series are locally linear (Gassian), but globally non-linear. For example, the trajectory of a human hand gesture can be viewed as a linear dynamic system driven by a nonlinear dynamic system that represents muscle actions. We present a mixed-state dynamic graphical model in which a hidden Markov model drives a linear dynamic system. This combination allows us to model both the discrete and continuous causes of trajectories such as human gestures. The number of computations needed for exact inference is exponential in the sequence length, so we derive an approximate variational inference technique that can also be used to learn the parameters of the discrete and continuous models. We show how the mixed-state model and the variational technique can be used to classify human hand gestures made with a computer mouse.Comment: Appears in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI1999

    Generating and designing DNA with deep generative models

    Full text link
    We propose generative neural network methods to generate DNA sequences and tune them to have desired properties. We present three approaches: creating synthetic DNA sequences using a generative adversarial network; a DNA-based variant of the activation maximization ("deep dream") design method; and a joint procedure which combines these two approaches together. We show that these tools capture important structures of the data and, when applied to designing probes for protein binding microarrays, allow us to generate new sequences whose properties are estimated to be superior to those found in the training data. We believe that these results open the door for applying deep generative models to advance genomics research.Comment: NIPS 2017 Computational Biology Worksho

    Interpreting Graph Cuts as a Max-Product Algorithm

    Full text link
    The maximum a posteriori (MAP) configuration of binary variable models with submodular graph-structured energy functions can be found efficiently and exactly by graph cuts. Max-product belief propagation (MP) has been shown to be suboptimal on this class of energy functions by a canonical counterexample where MP converges to a suboptimal fixed point (Kulesza & Pereira, 2008). In this work, we show that under a particular scheduling and damping scheme, MP is equivalent to graph cuts, and thus optimal. We explain the apparent contradiction by showing that with proper scheduling and damping, MP always converges to an optimal fixed point. Thus, the canonical counterexample only shows the suboptimality of MP with a particular suboptimal choice of schedule and damping. With proper choices, MP is optimal

    Hierarchical Affinity Propagation

    Full text link
    Affinity propagation is an exemplar-based clustering algorithm that finds a set of data-points that best exemplify the data, and associates each datapoint with one exemplar. We extend affinity propagation in a principled way to solve the hierarchical clustering problem, which arises in a variety of domains including biology, sensor networks and decision making in operational research. We derive an inference algorithm that operates by propagating information up and down the hierarchy, and is efficient despite the high-order potentials required for the graphical model formulation. We demonstrate that our method outperforms greedy techniques that cluster one layer at a time. We show that on an artificial dataset designed to mimic the HIV-strain mutation dynamics, our method outperforms related methods. For real HIV sequences, where the ground truth is not available, we show our method achieves better results, in terms of the underlying objective function, and show the results correspond meaningfully to geographical location and strain subtypes. Finally we report results on using the method for the analysis of mass spectra, showing it performs favorably compared to state-of-the-art methods

    Convolutional Factor Graphs as Probabilistic Models

    Full text link
    Based on a recent development in the area of error control coding, we introduce the notion of convolutional factor graphs (CFGs) as a new class of probabilistic graphical models. In this context, the conventional factor graphs are referred to as multiplicative factor graphs (MFGs). This paper shows that CFGs are natural models for probability functions when summation of independent latent random variables is involved. In particular, CFGs capture a large class of linear models, where the linearity is in the sense that the observed variables are obtained as a linear ransformation of the latent variables taking arbitrary distributions. We use Gaussian models and independent factor models as examples to emonstrate the use of CFGs. The requirement of a linear transformation between latent variables (with certain independence restriction) and the bserved variables, to an extent, limits the modelling flexibility of CFGs. This structural restriction however provides a powerful analytic tool to the framework of CFGs; that is, upon taking the Fourier transform of the function represented by the CFG, the resulting function is represented by a FG with identical structure. This Fourier transform duality allows inference problems on a CFG to be solved on the corresponding dual MFG.Comment: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004
    • …
    corecore