Probabilistic abductive logic programming using Dirichlet priors
Probabilistic programming is an area of research that aims to develop general inference algorithms for probabilistic models expressed as probabilistic programs whose execution corresponds to inferring the parameters of those models. In this paper, we introduce a probabilistic programming language (PPL) based on abductive logic programming for performing inference in probabilistic models involving categorical distributions with Dirichlet priors. We encode these models as abductive logic programs enriched with probabilistic definitions and queries, and show how to execute them and compile them to Boolean formulas. Using the latter, we perform generalized inference using one of two proposed Markov chain Monte Carlo (MCMC) sampling algorithms: an adaptation of uncollapsed Gibbs sampling from related work and a novel collapsed Gibbs sampling (CGS). We show that CGS converges faster than the uncollapsed version on a latent Dirichlet allocation (LDA) task using synthetic data. On similar data, we compare our PPL with LDA-specific algorithms and other PPLs. We find that all methods except one perform similarly, and that the more expressive the PPL, the slower it is. We illustrate applications of our PPL on real data in two variants of LDA models (Seed and Cluster LDA), and in the repeated insertion model (RIM). In the latter, our PPL yields similar conclusions to inference with EM for Mallows models.
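To make the collapsed-versus-uncollapsed distinction concrete, the following is a minimal sketch of collapsed Gibbs sampling for LDA in Python: the Dirichlet-distributed topic-word and document-topic parameters are integrated out, and only the per-token topic assignments are resampled from their full conditionals. This is the standard textbook CGS update, not the paper's PPL implementation; all names and hyperparameter values are illustrative assumptions.

```python
import numpy as np

def collapsed_gibbs_lda(docs, n_topics, n_words, alpha=0.1, beta=0.01,
                        n_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA over docs given as lists of word ids.
    Returns the final document-topic and topic-word count matrices."""
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # document-topic counts
    n_kw = np.zeros((n_topics, n_words))     # topic-word counts
    n_k = np.zeros(n_topics)                 # per-topic totals
    z = [rng.integers(n_topics, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):           # initialise counts randomly
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                  # remove current assignment
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # full conditional p(z = k | rest) with Dirichlets collapsed
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) \
                    / (n_k + n_words * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k                  # record the new assignment
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    return n_dk, n_kw
```

Because each update conditions on the counts of all other tokens, the chain typically mixes faster than an uncollapsed sampler that alternates between topic assignments and explicit Dirichlet parameter draws.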
Probabilistic (logic) programming concepts
A multitude of different probabilistic programming languages exists today, all extending a traditional programming language with primitives to support the modeling of complex, structured probability distributions. Each of these languages employs its own probabilistic primitives and comes with a particular syntax, semantics and inference procedure. This makes it hard to understand the underlying programming concepts and to appreciate the differences between the languages. To obtain a better understanding of probabilistic programming, we identify a number of core programming concepts underlying the primitives used by various probabilistic languages, discuss the execution mechanisms that they require, and use these to position and survey state-of-the-art probabilistic languages and their implementations. While doing so, we focus on probabilistic extensions of logic programming languages such as Prolog, which have been studied for over 20 years.
Probabilistic Inference in Piecewise Graphical Models
In many applications of probabilistic inference the models contain piecewise densities that are differentiable except at partition boundaries. For instance, (1) some models may intrinsically have finite support, being constrained to some regions; (2) arbitrary density functions may be approximated by mixtures of piecewise functions such as piecewise polynomials or piecewise exponentials; (3) distributions derived from other distributions (via random variable transformations) may be highly piecewise; (4) in applications of Bayesian inference such as Bayesian discrete classification and preference learning, the likelihood functions may be piecewise; (5) context-specific conditional probability density functions (tree-CPDs) are intrinsically piecewise; (6) influence diagrams (generalizations of Bayesian networks that model decision-making problems alongside probabilistic inference) are piecewise in many applications; (7) in probabilistic programming, conditional statements lead to piecewise models. As we will show, exact inference on piecewise models is often not scalable (when it is applicable at all), and the performance of existing approximate inference techniques on such models is usually quite poor.
This thesis fills this gap by presenting scalable and accurate algorithms for inference in piecewise probabilistic graphical models. Our first contribution is a variation of the Gibbs sampling algorithm that achieves an exponential sampling speedup on a large class of models (including Bayesian models with piecewise likelihood functions). As a second contribution, we show that for a large range of models, the time-consuming Gibbs sampling computations that are traditionally carried out per sample can instead be computed symbolically, once, prior to the sampling process. Among many potential applications, the resulting symbolic Gibbs sampler can be used for fully automated reasoning in the presence of deterministic constraints among random variables. As a third contribution, motivated by the behavior of Hamiltonian dynamics in optics (in particular, the reflection and refraction of light at refractive surfaces), we present a new Hamiltonian Monte Carlo method that demonstrates significantly improved performance on piecewise models.
We hope the present work represents a step towards scalable and accurate inference in an important class of probabilistic models that has largely been overlooked in the literature.
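The optics analogy can be illustrated with a small sketch: Hamiltonian Monte Carlo on a one-dimensional standard normal truncated to x >= 0, where a leapfrog step that crosses the boundary reflects both position and momentum instead of producing an invalid state. The target, tuning constants and update rule here are simplifying assumptions chosen for illustration, not the thesis's actual algorithm.

```python
import numpy as np

def reflective_hmc(n_samples, step=0.2, n_leap=10, seed=0):
    """HMC with boundary reflection for N(0, 1) restricted to x >= 0."""
    rng = np.random.default_rng(seed)
    grad_logp = lambda x: -x                 # d/dx log N(0, 1)
    x = 1.0
    samples = []
    for _ in range(n_samples):
        p = rng.normal()                     # fresh momentum
        h0 = 0.5 * p * p + 0.5 * x * x       # energy at the start
        xn, pn = x, p + 0.5 * step * grad_logp(x)
        for l in range(n_leap):              # leapfrog trajectory
            xn += step * pn
            if xn < 0:                       # crossed the boundary: reflect
                xn, pn = -xn, -pn
            if l < n_leap - 1:
                pn += step * grad_logp(xn)
        pn += 0.5 * step * grad_logp(xn)
        h1 = 0.5 * pn * pn + 0.5 * xn * xn   # energy at the end
        if rng.random() < np.exp(h0 - h1):   # Metropolis correction
            x = xn
        samples.append(x)
    return np.array(samples)
```

Reflection preserves |p|, so the Hamiltonian is unchanged at the boundary and the Metropolis acceptance rate stays high even though the density is only piecewise smooth.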
On the Relationship between Sum-Product Networks and Bayesian Networks
Sum-Product Networks (SPNs), which are probabilistic inference machines, have attracted a lot of interest in recent years. They have a wide range of applications, including but not limited to activity modeling, language modeling and speech modeling. Despite their practical applications and popularity, little research has been done on understanding the connections and differences between Sum-Product Networks and traditional graphical models, including Bayesian Networks (BNs) and Markov Networks (MNs). In this thesis, I establish some theoretical connections between Sum-Product Networks and Bayesian Networks. First, I prove that every SPN can be converted into a BN in time and space linear in the network size. Second, I show that by applying the Variable Elimination algorithm (VE) to the generated BN, I can recover the original SPN.
In the first direction, I use Algebraic Decision Diagrams (ADDs) to compactly represent the local conditional probability distributions at each node in the resulting BN by exploiting context-specific independence (CSI). The generated BN has a simple directed bipartite graphical structure. I establish the first connection between the depth of SPNs and the tree-width of the generated BNs, showing that the depth of SPNs is proportional to a lower bound of the tree-width of the BN.
In the other direction, I show that by applying the Variable Elimination algorithm (VE) to the generated BN with ADD representations, I can recover the original SPN, where the SPN can be viewed as a history record, or cache, of the VE inference process. To state the proof clearly, I introduce the notion of a "normal" SPN and present a theoretical analysis of the consistency and decomposability properties. I provide constructive algorithms to transform any given SPN into its normal form in time and space quadratic in the size of the SPN. Combining the two directions gives a deep understanding of the modeling power of SPNs and their inner working mechanism.
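For readers unfamiliar with SPNs, the following is a minimal sketch of bottom-up SPN evaluation: leaves are categorical distributions over single variables, sum nodes compute weighted mixtures, and product nodes factorise over disjoint variable sets. The tuple encoding and the toy network are illustrative assumptions, not the thesis's representation.

```python
def eval_spn(node, x):
    """Evaluate an SPN encoded as nested tuples at assignment x.
    Node forms: ("leaf", var, probs), ("sum", children, weights),
    ("prod", children)."""
    kind = node[0]
    if kind == "leaf":
        _, var, probs = node
        return probs[x[var]]                 # categorical leaf probability
    if kind == "sum":
        _, children, weights = node          # weighted mixture
        return sum(w * eval_spn(c, x) for c, w in zip(children, weights))
    _, children = node                       # product node: factorise
    out = 1.0
    for c in children:
        out *= eval_spn(c, x)
    return out

# A complete and decomposable SPN over two binary variables:
spn = ("sum",
       [("prod", [("leaf", 0, [0.9, 0.1]), ("leaf", 1, [0.3, 0.7])]),
        ("prod", [("leaf", 0, [0.2, 0.8]), ("leaf", 1, [0.6, 0.4])])],
       [0.4, 0.6])
```

A single bottom-up pass computes the probability of any complete assignment in time linear in the network size, which is exactly the efficiency that the VE-on-BN direction of the thesis recovers.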
Method of Moments in Approximate Bayesian Inference: From Theory to Practice
With recent advances in approximate inference, Bayesian methods have proven successful on larger datasets and more complex models. The central problem in Bayesian inference is how to approximate intractable posteriors accurately and efficiently. Variational inference deals with this problem by projecting the posterior onto a simpler distribution space. The projection step in variational inference is usually done by minimizing Kullback–Leibler divergence, but alternative methods may sometimes yield faster and more accurate solutions. Moments are statistics that describe the shape of a probability distribution, and one can project a distribution by matching a set of moments. The idea of moment matching dates back to the method of moments (MM), a simple approach to estimating unknown parameters by enforcing the model's moments to match their empirical estimates. While MM has been primarily studied in frequentist statistics, it lends itself naturally to approximate Bayesian inference.
This thesis aims to better understand how to apply MM to general-purpose Bayesian inference problems and what advantages MM methods offer in Bayesian inference. We begin with the simplest model in machine learning and gradually extend to more complex and practical settings. The scope of our work spans theory, methodology and applications. We first study a specific algorithm that uses MM in mixture posteriors, Bayesian Moment Matching (BMM). We prove the consistency of BMM in a naive Bayes model and then propose an initializer for Boolean SAT solvers based on its extension to Bayesian networks. BMM is quite restrictive and can only be used with conjugate priors. We then propose a new algorithm, Multiple Moment Matching Inference (MMMI), a general-purpose approximate Bayesian inference algorithm based on the idea of MM, and demonstrate its competitive predictive performance on real-world datasets.
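The core moment-matching step can be sketched in a few lines: project an empirical distribution on (0, 1) onto the Beta family by choosing the parameters whose mean and variance match the data. This is the generic method-of-moments projection, not the thesis's BMM or MMMI implementation; the function name and test distribution are assumptions.

```python
import numpy as np

def beta_moment_match(samples):
    """Return (a, b) such that Beta(a, b) has the empirical mean and
    variance of `samples`, using the closed-form Beta moment equations:
    mean = a/(a+b), var = ab/((a+b)^2 (a+b+1))."""
    m, v = samples.mean(), samples.var()
    common = m * (1 - m) / v - 1     # solves the two moment equations
    return m * common, (1 - m) * common

# Recover known parameters from synthetic draws:
rng = np.random.default_rng(1)
draws = rng.beta(2.0, 5.0, size=100_000)
a_hat, b_hat = beta_moment_match(draws)
```

In a Bayesian setting the same projection replaces an intractable mixture posterior with a single tractable distribution that preserves its first two moments, which is the step BMM applies after each observation.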
A Review of Symbolic, Subsymbolic and Hybrid Methods for Sequential Decision Making
The field of Sequential Decision Making (SDM) provides tools for solving Sequential Decision Processes (SDPs), where an agent must make a series of decisions in order to complete a task or achieve a goal. Historically, two competing SDM paradigms have vied for supremacy. Automated Planning (AP) proposes to solve SDPs by performing a reasoning process over a model of the world, often represented symbolically. Conversely, Reinforcement Learning (RL) proposes to learn the solution of the SDP from data, without a world model, and to represent the learned knowledge subsymbolically. In the spirit of reconciliation, we provide a review of symbolic, subsymbolic and hybrid methods for SDM. We cover both methods for solving SDPs (e.g., AP, RL and techniques that learn to plan) and methods for learning aspects of their structure (e.g., world models, state invariants and landmarks). To the best of our knowledge, no other review in the field provides the same scope. As an additional contribution, we discuss what properties an ideal method for SDM should exhibit and argue that neurosymbolic AI is the current approach that most closely resembles this ideal method. Finally, we outline several proposals to advance the field of SDM via the integration of symbolic and subsymbolic AI.
Proceedings of the ECMLPKDD 2015 Doctoral Consortium
The ECMLPKDD 2015 Doctoral Consortium was organized for the second time as part of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD), held in Porto during September 7-11, 2015. The objective of the doctoral consortium is to provide an environment in which students can exchange their ideas and experiences with peers in an interactive atmosphere and get constructive feedback from senior researchers in machine learning, data mining and related areas. These proceedings collect and document all the contributions of the ECMLPKDD 2015 Doctoral Consortium.