Bayesian Learning of Sum-Product Networks
Sum-product networks (SPNs) are flexible density estimators and have received
significant attention due to their attractive inference properties. While
parameter learning in SPNs is well developed, structure learning leaves
something to be desired: Even though there is a plethora of SPN structure
learners, most of them are somewhat ad hoc and based on intuition rather than a
clear learning principle. In this paper, we introduce a well-principled
Bayesian framework for SPN structure learning. First, we decompose the problem
into i) laying out a computational graph, and ii) learning the so-called scope
function over the graph. The first is rather unproblematic and akin to neural
network architecture validation. The second represents the effective structure
of the SPN and needs to respect the usual structural constraints of SPNs, i.e.,
completeness and decomposability. While representing and learning the scope
function is somewhat involved in general, in this paper, we propose a natural
parametrisation for an important and widely used special case of SPNs. These
structural parameters are incorporated into a Bayesian model, such that
simultaneous structure and parameter learning is cast into monolithic Bayesian
posterior inference. In various experiments, our Bayesian SPNs often improve
test likelihoods over greedy SPN learners. Further, since the Bayesian
framework protects against overfitting, we can evaluate hyper-parameters
directly on the Bayesian model score, waiving the need for a separate
validation set, which is especially beneficial in low data regimes. Bayesian
SPNs can be applied to heterogeneous domains and can easily be extended to
nonparametric formulations. Moreover, our Bayesian approach is the first that
consistently and robustly learns SPN structures under missing data.
Comment: NeurIPS 2019; see conference page for supplementary material.
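To make the structural constraints in this abstract concrete, here is a minimal Python sketch (with illustrative helper names, weights, and parameters; not the paper's code) of a complete and decomposable SPN evaluated in one bottom-up pass: sum nodes mix children over the same scope, product nodes combine children over disjoint scopes.

```python
# Minimal SPN sketch: completeness (sum children share a scope) and
# decomposability (product children have disjoint scopes) by construction.
# All names and numbers are illustrative assumptions.
import numpy as np
from scipy.stats import norm

def leaf(mu, sigma, dim):
    # Univariate Gaussian leaf over variable `dim`; its scope is {dim}.
    return lambda x: norm.logpdf(x[dim], mu, sigma)

def product(*children):
    # Product node: children must have disjoint scopes (decomposability),
    # so the joint log-density is the sum of child log-densities.
    return lambda x: sum(c(x) for c in children)

def mixture(weights, *children):
    # Sum node: children must share the same scope (completeness).
    w = np.log(np.asarray(weights))
    return lambda x: np.logaddexp.reduce(w + np.array([c(x) for c in children]))

# SPN over x = (x0, x1): a 2-component mixture of factorised Gaussians.
spn = mixture(
    [0.3, 0.7],
    product(leaf(-1.0, 0.5, 0), leaf(-1.0, 0.5, 1)),
    product(leaf(+2.0, 1.0, 0), leaf(+2.0, 1.0, 1)),
)

print(spn(np.array([0.0, 0.5])))  # exact log-density in one bottom-up pass
```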
Bayesian Structure and Parameter Learning of Sum-Product Networks
Sum-product networks (SPNs) are graphical models capable of handling large amounts of multi-
dimensional data. Unlike many other graphical models, SPNs are tractable if certain structural
requirements are fulfilled; a model is called tractable if probabilistic inference can be performed in
polynomial time with respect to the size of the model. The learning of SPNs can be separated
into two modes, parameter learning and structure learning. Many earlier approaches to SPN
learning have treated the two modes separately, but it has been found that good results can be
achieved by alternating between them. One example of this kind of algorithm was presented by
Trapp et al. in the article Bayesian Learning of Sum-Product Networks (NeurIPS 2019).
This thesis discusses SPNs and a Bayesian learning algorithm based on the aforementioned
algorithm, differing in some of the methods used. The algorithm by Trapp et al. uses Gibbs
sampling in the parameter-learning phase, whereas here Metropolis-Hastings MCMC is used. The
algorithm developed for this thesis was applied in two experiments, one with a small and simple
SPN and one with a larger and more complex SPN. The effects of data set size and data complexity
were also explored. The results were compared to those obtained by running the original
algorithm of Trapp et al.
The results show that having more data in the learning phase makes the results more accurate, as
it is easier for the model to find patterns in a larger data set. It was also shown that the model
was able to learn the parameters in the experiments if the data were simple enough, that is, if
each dimension of the data contained only one distribution. For more complex data, with multiple
distributions per dimension, the results showed that the computation struggled.
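As a rough illustration of the parameter-learning step this thesis describes, the following is a minimal random-walk Metropolis-Hastings sketch; the toy target (a single Gaussian leaf mean, with an assumed prior, known variance, and step size) is illustrative and not the thesis implementation.

```python
# Random-walk Metropolis-Hastings over one leaf parameter (the mean mu).
# Model, prior, and proposal scale are illustrative assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=50)   # synthetic observations

def log_post(mu):
    # log N(mu | 0, 10^2) prior + Gaussian likelihood with known sigma = 1.
    return norm.logpdf(mu, 0.0, 10.0) + norm.logpdf(data, mu, 1.0).sum()

mu, samples = 0.0, []
for _ in range(5000):
    prop = mu + rng.normal(0.0, 0.5)   # symmetric random-walk proposal
    # Accept with probability min(1, posterior ratio); log-space test.
    if np.log(rng.uniform()) < log_post(prop) - log_post(mu):
        mu = prop
    samples.append(mu)

print(np.mean(samples[1000:]))  # posterior mean estimate, near the true 2.0
```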
SPPL: Probabilistic Programming with Fast Exact Symbolic Inference
We present the Sum-Product Probabilistic Language (SPPL), a new probabilistic
programming language that automatically delivers exact solutions to a broad
range of probabilistic inference queries. SPPL translates probabilistic
programs into sum-product expressions, a new symbolic representation and
associated semantic domain that extends standard sum-product networks to
support mixed-type distributions, numeric transformations, logical formulas,
and pointwise and set-valued constraints. We formalize SPPL via a novel
translation strategy from probabilistic programs to sum-product expressions and
give sound exact algorithms for conditioning on and computing probabilities of
events. SPPL imposes a collection of restrictions on probabilistic programs to
ensure they can be translated into sum-product expressions, which allow the
system to leverage new techniques for improving the scalability of translation
and inference by automatically exploiting probabilistic structure. We implement
a prototype of SPPL with a modular architecture and evaluate it on benchmarks
the system targets, showing that it obtains up to 3500x speedups over
state-of-the-art symbolic systems on tasks such as verifying the fairness of
decision tree classifiers, smoothing hidden Markov models, conditioning
transformed random variables, and computing rare event probabilities.
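The flavor of query described above can be illustrated with a generic sketch (deliberately not SPPL's actual API): a mixed discrete/continuous distribution written as a weighted sum, with event probabilities and conditioning computed exactly from component CDFs rather than by sampling. The weights and parameters are illustrative assumptions.

```python
# Exact event probabilities for a mixed-type sum: X ~ 0.4*delta(0) + 0.6*N(1, 1).
from scipy.stats import norm

w_atom, w_cont = 0.4, 0.6  # illustrative mixture weights

def prob_X_in(lo, hi):
    # Exact P(lo < X <= hi): atom mass at 0 plus the Gaussian interval mass,
    # both available in closed form (no Monte Carlo approximation).
    atom = w_atom if lo < 0.0 <= hi else 0.0
    return atom + w_cont * (norm.cdf(hi, 1.0, 1.0) - norm.cdf(lo, 1.0, 1.0))

# Exact conditioning: P(X <= 0.5 | X > -1) as a ratio of event masses.
print(prob_X_in(-1.0, 0.5) / prob_X_in(-1.0, float('inf')))
```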
Positive Semi-Definite Probabilistic Circuits
The computation of probabilistic inference operations with models estimating probability distributions is crucial for applications requiring well-calibrated estimates of uncertainty, such as safety-critical decision-making, but often becomes computationally intractable due to the model's formulation. Probabilistic circuits (PCs) are models that can guarantee tractability of these operations by virtue of their formulation as structurally constrained computational graphs. PCs encode complex probability distributions as a hierarchical set of non-negatively weighted summations and products of simpler tractable probability distributions. The non-negative weight constraint on summations ensures non-negativity of the represented probability distribution; however, it is also known to hinder the expressiveness of PCs. This work proposes loosening this non-negativity constraint to a positive semi-definite (PSD) constraint, yielding a positive semi-definite parameterized PC (PSD-PC). This model can have negative weights and is hypothesized to be more expressive than PCs. PSD-PCs are shown to represent valid probability distributions and are proven to retain tractability for probabilistic inference operations. A density estimation experiment conducted on simulated and toy data sets showed empirical evidence that PSD-PCs are more expressively efficient than PCs, possibly due to an increased capability to model negative dependencies. PSD-PCs are, however, also subject to a stricter constraint on their graphical structure than PCs and are more challenging to optimize.
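The PSD idea in this abstract can be sketched concretely (illustrative code, not the paper's implementation): a sum unit computes f(x) = phi(x)^T A phi(x) with A positive semi-definite, so the density stays non-negative everywhere even though A contains negative weights, unlike a standard non-negative mixture. The leaf parameters and the matrix A below are assumptions for illustration.

```python
# PSD-weighted sum unit over two Gaussian leaves: non-negative by the
# quadratic form, despite a negative off-diagonal weight in A.
import numpy as np
from scipy.stats import norm

mus, sigmas = np.array([-1.0, 1.0]), np.array([1.0, 1.0])

# PSD parameter matrix with a negative off-diagonal entry: A = B B^T.
B = np.array([[1.0, 0.0], [-0.8, 0.6]])
A = B @ B.T

def phi(x):
    return norm.pdf(x, mus, sigmas)  # vector of leaf densities at x

# Normalising constant: Z = sum_ij A_ij * integral phi_i phi_j dx, using the
# closed-form Gaussian product integral N(mu_i | mu_j, sigma_i^2 + sigma_j^2).
Z = sum(A[i, j] * norm.pdf(mus[i], mus[j], np.sqrt(sigmas[i]**2 + sigmas[j]**2))
        for i in range(2) for j in range(2))

xs = np.linspace(-5, 5, 11)
density = np.array([phi(x) @ A @ phi(x) for x in xs]) / Z
print(A)                      # note the negative off-diagonal weights
print((density >= 0).all())   # True: the PSD constraint keeps f non-negative
```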