Extending Factor Graphs so as to Unify Directed and Undirected Graphical Models
The two most popular types of graphical model are directed models (Bayesian
networks) and undirected models (Markov random fields, or MRFs). Directed and
undirected models offer complementary properties in model construction,
expressing conditional independencies, expressing arbitrary factorizations of
joint distributions, and formulating message-passing inference algorithms. We
show that the strengths of these two representations can be combined in a
single type of graphical model called a 'factor graph'. Every Bayesian network
or MRF can be easily converted to a factor graph that expresses the same
conditional independencies, expresses the same factorization of the joint
distribution, and can be used for probabilistic inference through application
of a single, simple message-passing algorithm. In contrast to chain graphs,
where message-passing is implemented on a hypergraph, message-passing can be
directly implemented on the factor graph. We describe a modified 'Bayes-ball'
algorithm for establishing conditional independence in factor graphs, and we
show that factor graphs form a strict superset of Bayesian networks and MRFs.
In particular, we give an example of a commonly-used 'mixture of experts' model
fragment, whose independencies cannot be represented in a Bayesian network or
an MRF, but can be represented in a factor graph. We finish by giving examples
of real-world problems that are not well suited to representation in Bayesian
networks and MRFs, but are well suited to representation in factor graphs.
Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)
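As a brief illustrative sketch (not part of the abstract itself): a three-variable Bayesian network with factorization p(x_1, x_2, x_3) = p(x_1) p(x_2 | x_1) p(x_3 | x_1) converts to a factor graph with one factor node per conditional,

    f_a(x_1) = p(x_1),   f_b(x_1, x_2) = p(x_2 | x_1),   f_c(x_1, x_3) = p(x_3 | x_1),

so that the joint is the product f_a f_b f_c and the sum-product message-passing algorithm runs directly on the resulting bipartite variable/factor graph.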
Cumulative distribution networks and the derivative-sum-product algorithm
We introduce a new type of graphical model called a "cumulative distribution
network" (CDN), which expresses a joint cumulative distribution as a product of
local functions. Each local function can be viewed as providing evidence about
possible orderings, or rankings, of variables. Interestingly, we find that the
conditional independence properties of CDNs are quite different from other
graphical models. We also describe a message-passing algorithm that efficiently
computes conditional cumulative distributions. Due to the unique independence
properties of the CDN, these messages do not in general have a one-to-one
correspondence with messages exchanged in standard algorithms, such as belief
propagation. We demonstrate the application of CDNs for structured ranking
learning using a previously-studied multi-player gaming dataset.
Comment: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)
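As a hedged illustration of the factorized form described above, a CDN over three variables might express the joint CDF as a product of local functions on overlapping subsets,

    F(x_1, x_2, x_3) = phi_a(x_1, x_2) · phi_b(x_2, x_3),

where each phi is a CDF-like local function. Marginal CDFs follow by letting the left-out arguments tend to infinity, and, roughly speaking, densities and conditional CDFs are obtained by differentiating the product, which is the kind of operation the derivative-sum-product messages carry through the graph.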
Learning Generative Models of Similarity Matrices
We describe a probabilistic (generative) view of affinity matrices along with
inference algorithms for a subclass of problems associated with data
clustering. This probabilistic view is helpful in understanding different
models and algorithms that are based on affinity functions of the data. In
particular, we show how (greedy) inference for a specific probabilistic model is
equivalent to the spectral clustering algorithm. It also provides a framework
for developing new algorithms and extended models. As one case, we present new
generative data clustering models that allow us to infer the underlying
distance measure suitable for the clustering problem at hand. These models seem
to perform well in a larger class of problems for which other clustering
algorithms (including spectral clustering) usually fail. Experimental
evaluation was performed on a variety of point data sets, showing excellent
performance.
Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)
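For reference, the spectral clustering baseline that the generative view is related to can be stated concretely. The sketch below is an assumed, minimal implementation of the standard normalized spectral embedding followed by k-means (it is not the paper's code; the function name and inputs are illustrative):

    # Spectral clustering sketch: embed points using the leading eigenvectors
    # of the symmetrically normalized affinity matrix, then run k-means.
    import numpy as np
    from sklearn.cluster import KMeans

    def spectral_clustering(affinity, n_clusters):
        d = affinity.sum(axis=1)
        d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
        normalized = affinity * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
        # eigh returns eigenvalues in ascending order; keep the top ones.
        _, eigvecs = np.linalg.eigh(normalized)
        embedding = eigvecs[:, -n_clusters:]
        embedding = embedding / (np.linalg.norm(embedding, axis=1, keepdims=True) + 1e-12)
        return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embedding)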
Matrix Tile Analysis
Many tasks require finding groups of elements in a matrix of numbers, symbols
or class likelihoods. One approach is to use efficient bi- or tri-linear
factorization techniques including PCA, ICA, sparse matrix factorization and
plaid analysis. These techniques are not appropriate when addition and
multiplication of matrix elements are not sensibly defined. More directly,
methods like bi-clustering can be used to classify matrix elements, but these
methods make the overly-restrictive assumption that the class of each element
is a function of a row class and a column class. We introduce a general
computational problem, `matrix tile analysis' (MTA), which consists of
decomposing a matrix into a set of non-overlapping tiles, each of which is
defined by a subset of usually nonadjacent rows and columns. MTA does not
require an algebra for combining tiles, but must search over discrete
combinations of tile assignments. Exact MTA is a computationally intractable
integer programming problem, but we describe an approximate iterative technique
and a computationally efficient sum-product relaxation of the integer program.
We compare the effectiveness of these methods to PCA and plaid on hundreds of
randomly generated tasks. Using double-gene-knockout data, we show that MTA
finds groups of interacting yeast genes that have biologically-related
functions.
Comment: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)
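To make the combinatorial structure concrete (an illustrative paraphrase, not the paper's exact formulation): a tile k can be described by a row subset R_k and a column subset C_k, so that it covers the cells R_k × C_k; non-overlap requires (R_k × C_k) ∩ (R_l × C_l) = ∅ for k ≠ l, and the objective sums a per-tile fit score over the covered cells. Searching over these discrete row and column subsets, rather than over an algebraic combination of components, is what makes exact MTA an intractable integer program.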
A Factorized Variational Technique for Phase Unwrapping in Markov Random Fields
Some types of medical and topographic imaging device produce images in which
the pixel values are "phase-wrapped", i.e. measured modulo a known scalar.
Phase unwrapping can be viewed as the problem of inferring the number of shifts
between each and every pair of neighboring pixels, subject to an a priori
preference for smooth surfaces, and subject to a zero curl constraint, which
requires that the shifts must sum to 0 around every loop. We formulate phase
unwrapping as a mean field inference problem in a Markov network, where the
prior favors the zero curl constraint. We compare our mean field technique with
the least squares method on a synthetic 100x100 image, and give results on a
512x512 synthetic aperture radar image from Sandia National Laboratories.
Comment: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001)
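In equation form (an illustrative restatement, with T denoting the known wrapping scalar, e.g. 2π): each observed pixel is y_i = x_i mod T, the unknown integer shift on the edge between neighboring pixels i and j satisfies

    x_j − x_i = (y_j − y_i) + T · k_ij,   with k_ij an integer,

and the zero-curl constraint requires the shifts around every elementary loop of four pixels to sum to zero, so that the inferred shifts are consistent with some underlying unwrapped surface x.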
Variational Learning in Mixed-State Dynamic Graphical Models
Many real-valued stochastic time-series are locally linear (Gaussian), but
globally non-linear. For example, the trajectory of a human hand gesture can be
viewed as a linear dynamic system driven by a nonlinear dynamic system that
represents muscle actions. We present a mixed-state dynamic graphical model in
which a hidden Markov model drives a linear dynamic system. This combination
allows us to model both the discrete and continuous causes of trajectories such
as human gestures. The number of computations needed for exact inference is
exponential in the sequence length, so we derive an approximate variational
inference technique that can also be used to learn the parameters of the
discrete and continuous models. We show how the mixed-state model and the
variational technique can be used to classify human hand gestures made with a
computer mouse.
Comment: Appears in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI1999)
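A hedged sketch of the generative structure described above (the paper's exact parameterization may differ): a discrete switch state s_t follows a Markov chain, the continuous state follows switch-dependent linear-Gaussian dynamics, and the observation is a noisy linear readout,

    s_t ~ P(s_t | s_{t-1}),
    x_t = A(s_t) x_{t-1} + w_t,   w_t ~ N(0, Q(s_t)),
    y_t = C x_t + v_t,            v_t ~ N(0, R).

Exact inference must track a posterior mixture over all discrete state sequences, which grows exponentially with the sequence length; the variational technique replaces it with a tractable approximating posterior, for instance one that decouples the discrete and continuous chains.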
Generating and designing DNA with deep generative models
We propose generative neural network methods to generate DNA sequences and
tune them to have desired properties. We present three approaches: creating
synthetic DNA sequences using a generative adversarial network; a DNA-based
variant of the activation maximization ("deep dream") design method; and a
joint procedure which combines these two approaches together. We show that
these tools capture important structures of the data and, when applied to
designing probes for protein binding microarrays, allow us to generate new
sequences whose properties are estimated to be superior to those found in the
training data. We believe that these results open the door for applying deep
generative models to advance genomics research.
Comment: NIPS 2017 Computational Biology Workshop
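To give a sense of the activation-maximization ("deep dream") ingredient, the toy sketch below performs gradient ascent on a relaxed one-hot sequence encoding to increase a differentiable score. The linear scorer W is a stand-in assumption; in the paper the score comes from a trained neural network predictor, and the GAN component is omitted here entirely:

    # Toy activation-maximization sketch for sequence design:
    # take gradient steps on relaxed one-hot logits to increase a score.
    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, alphabet = 20, 4                    # e.g. DNA letters A, C, G, T
    W = rng.normal(size=(seq_len, alphabet))     # stand-in per-position scores

    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    logits = rng.normal(size=(seq_len, alphabet))
    for _ in range(200):
        probs = softmax(logits)                  # relaxed one-hot sequence
        # Gradient of sum(probs * W) with respect to the logits.
        grad = probs * (W - np.sum(W * probs, axis=1, keepdims=True))
        logits += 0.1 * grad
    design = softmax(logits).argmax(axis=1)      # discretize to a sequence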
Interpreting Graph Cuts as a Max-Product Algorithm
The maximum a posteriori (MAP) configuration of binary variable models with
submodular graph-structured energy functions can be found efficiently and
exactly by graph cuts. Max-product belief propagation (MP) has been shown to be
suboptimal on this class of energy functions by a canonical counterexample
where MP converges to a suboptimal fixed point (Kulesza & Pereira, 2008).
In this work, we show that under a particular scheduling and damping scheme,
MP is equivalent to graph cuts, and thus optimal. We explain the apparent
contradiction by showing that with proper scheduling and damping, MP always
converges to an optimal fixed point. Thus, the canonical counterexample only
shows the suboptimality of MP with a particular suboptimal choice of schedule
and damping. With proper choices, MP is optimal.
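For reference, the submodularity condition on the pairwise energies that makes exact MAP inference by graph cuts possible is

    theta_ij(0, 0) + theta_ij(1, 1) ≤ theta_ij(0, 1) + theta_ij(1, 0)

for every edge (i, j); under this condition the energy can be minimized exactly by a single max-flow/min-cut computation, and this is the optimum that the properly scheduled and damped max-product updates are shown to reach.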
Hierarchical Affinity Propagation
Affinity propagation is an exemplar-based clustering algorithm that finds a
set of data points that best exemplify the data, and associates each data point
with one exemplar. We extend affinity propagation in a principled way to solve
the hierarchical clustering problem, which arises in a variety of domains
including biology, sensor networks and decision making in operational research.
We derive an inference algorithm that operates by propagating information up
and down the hierarchy, and is efficient despite the high-order potentials
required for the graphical model formulation. We demonstrate that our method
outperforms greedy techniques that cluster one layer at a time. We show that on
an artificial dataset designed to mimic the HIV-strain mutation dynamics, our
method outperforms related methods. For real HIV sequences, where the ground
truth is not available, we show our method achieves better results, in terms of
the underlying objective function, and show the results correspond meaningfully
to geographical location and strain subtypes. Finally we report results on
using the method for the analysis of mass spectra, showing it performs
favorably compared to state-of-the-art methods.
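Since the hierarchical method builds on flat affinity propagation, a minimal sketch of the flat algorithm's responsibility/availability updates may help; this is an assumed illustration (function name and damping schedule are not from the paper):

    # Flat affinity propagation sketch: s is an n x n similarity matrix whose
    # diagonal holds the exemplar preferences.
    import numpy as np

    def affinity_propagation(s, iters=200, damping=0.5):
        n = s.shape[0]
        r = np.zeros((n, n))   # responsibilities
        a = np.zeros((n, n))   # availabilities
        for _ in range(iters):
            # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
            m = a + s
            idx = np.argmax(m, axis=1)
            first = m[np.arange(n), idx]
            m[np.arange(n), idx] = -np.inf
            second = m.max(axis=1)
            max_excl = np.repeat(first[:, None], n, axis=1)
            max_excl[np.arange(n), idx] = second
            r = damping * r + (1 - damping) * (s - max_excl)
            # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
            rp = np.maximum(r, 0)
            np.fill_diagonal(rp, np.diag(r))
            a_new = rp.sum(axis=0, keepdims=True) - rp
            diag = np.diag(a_new).copy()
            a_new = np.minimum(a_new, 0)
            np.fill_diagonal(a_new, diag)
            a = damping * a + (1 - damping) * a_new
        return np.argmax(a + r, axis=1)   # exemplar index for each point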
Convolutional Factor Graphs as Probabilistic Models
Based on a recent development in the area of error control coding, we
introduce the notion of convolutional factor graphs (CFGs) as a new class of
probabilistic graphical models. In this context, the conventional factor graphs
are referred to as multiplicative factor graphs (MFGs). This paper shows that
CFGs are natural models for probability functions when summation of independent
latent random variables is involved. In particular, CFGs capture a large class
of linear models, where the linearity is in the sense that the observed
variables are obtained as a linear transformation of the latent variables taking
arbitrary distributions. We use Gaussian models and independent factor models
as examples to demonstrate the use of CFGs. The requirement of a linear
transformation between latent variables (with certain independence restrictions)
and the observed variables, to an extent, limits the modelling flexibility of
CFGs. This structural restriction however provides a powerful analytic tool to
the framework of CFGs; that is, upon taking the Fourier transform of the
function represented by the CFG, the resulting function is represented by an MFG
with identical structure. This Fourier transform duality allows inference
problems on a CFG to be solved on the corresponding dual MFG.
Comment: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)
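The simplest instance of the duality described above (stated as an illustration): if the observed variable is a sum X = Z_1 + Z_2 of independent latent variables, its density is the convolution p_X(x) = ∫ p_{Z_1}(z) p_{Z_2}(x − z) dz, and taking Fourier transforms turns the convolution into a product, F[p_X] = F[p_{Z_1}] · F[p_{Z_2}]. A CFG represents the convolutional form directly, while its Fourier dual, with identical structure, is an ordinary multiplicative factor graph on which standard message passing applies.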