Bayesian Learning of Sum-Product Networks
Sum-product networks (SPNs) are flexible density estimators and have received
significant attention due to their attractive inference properties. While
parameter learning in SPNs is well developed, structure learning leaves
something to be desired: Even though there is a plethora of SPN structure
learners, most of them are somewhat ad hoc and based on intuition rather than a
clear learning principle. In this paper, we introduce a well-principled
Bayesian framework for SPN structure learning. First, we decompose the problem
into i) laying out a computational graph, and ii) learning the so-called scope
function over the graph. The first is rather unproblematic and akin to neural
network architecture validation. The second represents the effective structure
of the SPN and needs to respect the usual structural constraints of SPNs, i.e.,
completeness and decomposability. While representing and learning the scope
function is somewhat involved in general, in this paper, we propose a natural
parametrisation for an important and widely used special case of SPNs. These
structural parameters are incorporated into a Bayesian model, such that
simultaneous structure and parameter learning is cast into monolithic Bayesian
posterior inference. In various experiments, our Bayesian SPNs often improve
test likelihoods over greedy SPN learners. Further, since the Bayesian
framework protects against overfitting, we can evaluate hyper-parameters
directly on the Bayesian model score, obviating the need for a separate
validation set, which is especially beneficial in low data regimes. Bayesian
SPNs can be applied to heterogeneous domains and can easily be extended to
nonparametric formulations. Moreover, our Bayesian approach is the first that
consistently and robustly learns SPN structures under missing data.
Comment: NeurIPS 2019; see conference page for supplementary material
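The two structural constraints named in the abstract are concrete enough to check mechanically. As a minimal, hypothetical sketch (illustrative class names, not the paper's implementation), the following Python snippet evaluates a tiny SPN over two binary variables and asserts completeness and decomposability at construction time:

```python
# Minimal SPN sketch: sum nodes must be complete (children share a scope),
# product nodes decomposable (children have disjoint scopes). All names
# here are illustrative, not taken from the paper's code.
import numpy as np

class Leaf:
    """Bernoulli leaf over a single variable (its scope)."""
    def __init__(self, var, p):
        self.scope, self.p = {var}, p
    def value(self, x):
        return self.p if x[next(iter(self.scope))] == 1 else 1.0 - self.p

class Product:
    """Product node; decomposability: child scopes are disjoint."""
    def __init__(self, children):
        scopes = [c.scope for c in children]
        assert sum(len(s) for s in scopes) == len(set().union(*scopes)), \
            "decomposability violated: overlapping child scopes"
        self.children, self.scope = children, set().union(*scopes)
    def value(self, x):
        return float(np.prod([c.value(x) for c in self.children]))

class Sum:
    """Sum node; completeness: all children share the same scope."""
    def __init__(self, children, weights):
        assert all(c.scope == children[0].scope for c in children), \
            "completeness violated: child scopes differ"
        assert np.isclose(sum(weights), 1.0)
        self.children, self.weights = children, weights
        self.scope = children[0].scope
    def value(self, x):
        return sum(w * c.value(x) for w, c in zip(self.weights, self.children))

# A valid SPN over binary X0, X1: a mixture of two factorisations.
spn = Sum(
    [Product([Leaf(0, 0.9), Leaf(1, 0.2)]),
     Product([Leaf(0, 0.1), Leaf(1, 0.7)])],
    weights=[0.6, 0.4],
)
print(spn.value({0: 1, 1: 0}))  # exact joint probability P(X0=1, X1=0)
```

Because every node respects these constraints, the root evaluates an exact, normalised joint probability in a single bottom-up pass, which is the tractability property the abstract trades on.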
Learning Logistic Circuits
This paper proposes a new classification model called logistic circuits. On
the MNIST and Fashion-MNIST datasets, our learning algorithm outperforms neural networks
that have an order of magnitude more parameters. Yet, logistic circuits have a
distinct origin in symbolic AI, forming a discriminative counterpart to
probabilistic-logical circuits such as ACs, SPNs, and PSDDs. We show that
parameter learning for logistic circuits is convex optimization, and that a
simple local search algorithm can induce strong model structures from data.
Comment: Published in the Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)
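The convexity claim has a familiar shape: once each example is mapped to a fixed binary feature vector (in a logistic circuit, roughly the indicators of which wires are active when the circuit is evaluated on that example), weight learning is ordinary logistic regression. The toy sketch below uses a stand-in feature map rather than real circuit semantics, purely to illustrate why that step is convex:

```python
# Toy illustration of the convex parameter-learning step. The feature map
# (raw bits plus one conjunction) is a hypothetical stand-in for the wire
# features a real logistic circuit would produce.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 8))        # binary inputs
phi = np.hstack([X, X[:, [0]] * X[:, [1]]])  # stand-in "wire" features
y = X[:, 0] ^ X[:, 1]                        # toy XOR labels

# With the feature map fixed, fitting the weights is convex optimisation.
clf = LogisticRegression(max_iter=1000).fit(phi, y)
print(clf.score(phi, y))  # ~1.0: XOR becomes linear given the conjunction
```

The point of the miniature: the hard, non-convex part is choosing the structure (which features exist), which is exactly what the paper's local search addresses; the parameters themselves come from a convex fit.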
Learning Tractable Probabilistic Models for Fault Localization
In recent years, several probabilistic techniques have been applied to
various debugging problems. However, most existing probabilistic debugging
systems use relatively simple statistical models, and fail to generalize across
multiple programs. In this work, we propose Tractable Fault Localization Models
(TFLMs) that can be learned from data, and probabilistically infer the location
of the bug. While most previous statistical debugging methods generalize over
many executions of a single program, TFLMs are trained on a corpus of
previously seen buggy programs, and learn to identify recurring patterns of
bugs. Widely used fault localization techniques such as TARANTULA evaluate the
suspiciousness of each line in isolation; in contrast, a TFLM defines a joint
probability distribution over bug-indicator variables, one per line. Joint
distributions with rich dependency structure are often computationally
intractable; TFLMs avoid this by exploiting recent developments in tractable
probabilistic models (specifically, Relational SPNs). Further, TFLMs can
incorporate additional sources of information, including coverage-based
features such as TARANTULA. We evaluate the fault localization performance of
TFLMs that include TARANTULA scores as features in the probabilistic model. Our
study shows that the learned TFLMs isolate bugs more effectively than previous
statistical methods or TARANTULA alone.
Comment: Fifth International Workshop on Statistical Relational AI (StaR-AI 2015)
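Since TARANTULA scores enter the model as features, it helps to recall what they are: a per-line statistic computed from how often passing and failing tests execute that line. A self-contained sketch (variable names illustrative):

```python
# TARANTULA suspiciousness for one line s, from test-coverage counts:
#   susp(s) = (failed(s)/F) / (failed(s)/F + passed(s)/P)
# where F and P are the total numbers of failing and passing tests.
# A line covered mostly by failing tests scores close to 1.

def tarantula(failed_cov, passed_cov, total_failed, total_passed):
    f = failed_cov / total_failed if total_failed else 0.0
    p = passed_cov / total_passed if total_passed else 0.0
    return f / (f + p) if (f + p) else 0.0

# Line executed by 4 of 5 failing tests but only 1 of 20 passing tests:
print(tarantula(4, 1, 5, 20))  # ~0.94, highly suspicious
```

A TFLM, by contrast, treats such per-line scores as evidence inside a joint distribution over all lines, so correlations between lines (learned from a corpus of past buggy programs) can redistribute suspicion rather than scoring each line in isolation.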
Conditional Sum-Product Networks: Imposing Structure on Deep Probabilistic Architectures
Probabilistic graphical models are a central tool in AI; however, they are
generally not as expressive as deep neural models, and inference is notoriously
hard and slow. In contrast, deep probabilistic models such as sum-product
networks (SPNs) capture joint distributions in a tractable fashion, but still
lack the expressive power of intractable models based on deep neural networks.
Therefore, we introduce conditional SPNs (CSPNs), conditional density
estimators for multivariate and potentially hybrid domains which allow
harnessing the expressive power of neural networks while still maintaining
tractability guarantees. One way to implement CSPNs is to use an existing SPN
structure and condition its parameters on the input, e.g., via a deep neural
network. This approach, however, might misrepresent the conditional
independence structure present in data. Consequently, we also develop a
structure-learning approach that derives both the structure and parameters of
CSPNs from data. Our experimental evidence demonstrates that CSPNs are
competitive with other probabilistic models and yield superior performance on
multilabel image classification compared to mean field and mixture density
networks. Furthermore, they can successfully be employed as building blocks for
structured probabilistic models, such as autoregressive image models.
Comment: 13 pages, 6 figures
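The first implementation route above (keep a fixed SPN structure, let a neural network emit its parameters) can be sketched in a few lines. Below, a tiny, untrained, randomly initialised network maps the conditioning input x to softmax-normalised weights of a single two-component sum node over y; the architecture, names, and Gaussian leaves are illustrative assumptions, not the paper's model:

```python
# Hedged sketch of "condition the SPN's parameters on the input": a small
# gating network produces the sum-node weights of p(y | x). Untrained
# weights, for shape illustration only.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(1, 16)), np.zeros(16)  # hypothetical gating net
W2, b2 = rng.normal(size=(16, 2)), np.zeros(2)   # logits for 2 sum weights

def conditional_weights(x):
    """Map input x to normalised weights of a 2-component sum node."""
    h = np.tanh(x @ W1 + b1)
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)     # softmax: sums to 1

def gaussian(y, mu, sigma=1.0):
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def cspn_density(y, x, mus=(-2.0, 2.0)):
    """p(y | x): a sum node whose weights depend on x (fixed leaf means)."""
    w = conditional_weights(x)
    return w[..., 0] * gaussian(y, mus[0]) + w[..., 1] * gaussian(y, mus[1])

x = np.array([[0.3]])
print(cspn_density(1.5, x))  # conditional density of y = 1.5 given x
```

Because the softmax keeps the sum-node weights normalised for every input, each conditional slice remains a valid, tractable SPN; the structure-learning variant described in the abstract additionally lets the data determine which such structure to use.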