A Factor Graph Approach to Automated Design of Bayesian Signal Processing Algorithms
The benefits of automating design cycles for Bayesian inference-based
algorithms are becoming increasingly recognized by the machine learning
community. As a result, interest in probabilistic programming frameworks has
increased considerably over the past few years. This paper explores a specific
probabilistic programming paradigm, namely message passing in Forney-style
factor graphs (FFGs), in the context of automated design of efficient Bayesian
signal processing algorithms. To this end, we developed "ForneyLab"
(https://github.com/biaslab/ForneyLab.jl) as a Julia toolbox for message
passing-based inference in FFGs. We show by example how ForneyLab enables
automatic derivation of Bayesian signal processing algorithms, including
algorithms for parameter estimation and model comparison. Crucially, due to the
modular makeup of the FFG framework, both the model specification and inference
methods are readily extensible in ForneyLab. In order to test this framework,
we compared variational message passing as implemented by ForneyLab with
automatic differentiation variational inference (ADVI) and Monte Carlo methods
as implemented by state-of-the-art tools "Edward" and "Stan". In terms of
performance, extensibility, and stability, ForneyLab appears to enjoy an
edge relative to its competitors for automated inference in state-space models.
Comment: Accepted for publication in the International Journal of Approximate
Reasoning.
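The sum-product message passing that underlies inference in factor graphs can be illustrated with a minimal NumPy sketch: forward messages along a chain-structured model are just matrix-vector products, and the posterior marginal is their normalized product with the evidence. This is a toy illustration of the message passing idea, not the ForneyLab API (which is a Julia toolbox); the prior, transition matrix, and evidence values below are made up for the example.

```python
import numpy as np

# Prior over the first hidden state (2 states), a transition matrix
# p(x_{t+1} | x_t), and per-step evidence messages (observation likelihoods).
prior = np.array([0.6, 0.4])
trans = np.array([[0.9, 0.1],
                  [0.2, 0.8]])          # rows: x_t, columns: x_{t+1}
evidence = [np.array([0.7, 0.3]),
            np.array([0.4, 0.6]),
            np.array([0.1, 0.9])]

# Forward (sum-product) messages: alpha_t(x_t) ∝ p(x_t, y_1..t).
alpha = prior * evidence[0]
for e in evidence[1:]:
    alpha = (alpha @ trans) * e

# Posterior over the final state given all observations.
posterior = alpha / alpha.sum()
```

Because each message is local to one factor, swapping in a different transition or observation model only changes the corresponding message computation; this modularity is what makes the FFG framework extensible.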
Particle Gibbs Split-Merge Sampling for Bayesian Inference in Mixture Models
This paper presents a new Markov chain Monte Carlo method to sample from the
posterior distribution of conjugate mixture models. This algorithm relies on a
flexible split-merge procedure built using the particle Gibbs sampler. Contrary
to available split-merge procedures, the resulting so-called Particle Gibbs
Split-Merge sampler does not require the computation of a complex acceptance
ratio, is simple to implement using existing sequential Monte Carlo libraries
and can be parallelized. We investigate its performance experimentally on
synthetic problems as well as on geolocation and cancer genomics data. In all
these examples, the particle Gibbs split-merge sampler outperforms
state-of-the-art split-merge methods by up to an order of magnitude for a fixed
computational complexity.
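The conjugate-mixture posterior sampling that split-merge moves are designed to accelerate can be sketched with a plain collapsed Gibbs sweep. The toy below uses a two-component Beta-Bernoulli mixture with a symmetric Dirichlet prior on the weights; it is an assumed minimal example, not the Particle Gibbs Split-Merge procedure itself (which replaces these one-point-at-a-time moves with particle-based block moves over whole clusters).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary data: first half mostly ones, second half mostly zeros.
x = np.array([1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0])
K, a, b, alpha = 2, 1.0, 1.0, 1.0      # Beta(a, b) likelihood prior, Dirichlet(alpha/K)
z = rng.integers(K, size=len(x))       # initial cluster assignments

for _ in range(50):                    # collapsed Gibbs sweeps
    for i in range(len(x)):
        z[i] = -1                      # remove point i from its cluster
        logp = np.empty(K)
        for k in range(K):
            members = x[z == k]
            n, s = len(members), members.sum()
            # Posterior predictive of x_i under cluster k (Beta-Bernoulli).
            p1 = (a + s) / (a + b + n)
            lik = p1 if x[i] == 1 else 1.0 - p1
            logp[k] = np.log(n + alpha / K) + np.log(lik)
        p = np.exp(logp - logp.max())
        z[i] = rng.choice(K, p=p / p.sum())
```

One-point-at-a-time moves like this mix slowly when two clusters should be merged or one split, which is exactly the regime where split-merge proposals pay off.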
Probabilistic Clustering of Time-Evolving Distance Data
We present a novel probabilistic clustering model for objects that are
represented via pairwise distances and observed at different time points. The
proposed method utilizes the information given by adjacent time points to find
the underlying cluster structure and obtain a smooth cluster evolution. This
approach allows the number of objects and clusters to differ at every time
point, and no matching of object identities across time points is needed.
Further, the model does not require the number of clusters to be specified in
advance -- it is instead determined automatically using a Dirichlet process
prior. We validate our model on synthetic data showing that the proposed method
is more accurate than state-of-the-art clustering methods. Finally, we use our
dynamic clustering model to analyze and illustrate the evolution of brain
cancer patients over time.
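The way a Dirichlet process prior lets the data determine the number of clusters can be illustrated through its Chinese restaurant process representation: each new item joins an existing cluster with probability proportional to that cluster's size, or opens a new one with probability proportional to the concentration parameter. A minimal sketch, with an assumed concentration value:

```python
import numpy as np

def crp(n, alpha, rng):
    """Sample a random partition of n items from a Chinese restaurant process."""
    assignments = [0]                  # the first item starts cluster 0
    counts = [1]
    for _ in range(1, n):
        # Existing cluster k with prob ∝ counts[k]; new cluster with prob ∝ alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)           # open a new cluster
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments, counts

rng = np.random.default_rng(1)
assignments, counts = crp(100, alpha=2.0, rng=rng)
```

The number of occupied clusters is a random outcome of the process (growing roughly logarithmically with n), never a quantity fixed in advance, which is what the model above exploits at each time point.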
Conjoined Dirichlet Process
Biclustering is a class of techniques that simultaneously clusters the rows
and columns of a matrix to sort heterogeneous data into homogeneous blocks.
Although many algorithms have been proposed to find biclusters, existing
methods suffer from the pre-specification of the number of biclusters or place
constraints on the model structure. To address these issues, we develop a
novel, non-parametric probabilistic biclustering method based on Dirichlet
processes to identify biclusters with strong co-occurrence in both rows and
columns. The proposed method utilizes dual Dirichlet process mixture models to
learn row and column clusters, with the number of resulting clusters determined
by the data rather than pre-specified. Probabilistic biclusters are identified
by modeling the mutual dependence between the row and column clusters. We apply
our method to two different applications, text mining and gene expression
analysis, and demonstrate that our method improves bicluster extraction in many
settings compared to existing approaches.
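The dual-clustering idea behind this kind of biclustering can be shown with a toy example: once row and column cluster labels exist, every (row cluster, column cluster) pair induces a candidate bicluster, and blocks with strong co-occurrence stand out. The labels below are assumed for illustration; in the method described above they are inferred from the data by the dual Dirichlet process mixtures rather than fixed by hand.

```python
import numpy as np

# Toy binary matrix with a planted bicluster in the top-left block.
X = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])

# Hypothetical row and column cluster labels (illustrative, not inferred).
row_z = np.array([0, 0, 1, 1])
col_z = np.array([0, 0, 1, 1])

# Each (row cluster, column cluster) pair defines a candidate bicluster;
# score each block by its mean co-occurrence.
scores = {}
for r in np.unique(row_z):
    for c in np.unique(col_z):
        block = X[np.ix_(row_z == r, col_z == c)]
        scores[(r, c)] = block.mean()
```

The planted block scores 1.0 while the off-diagonal blocks score near zero, which is the kind of mutual row-column dependence the probabilistic model captures.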