On Pruning for Score-Based Bayesian Network Structure Learning
Many algorithms for score-based Bayesian network structure learning (BNSL),
in particular exact ones, take as input a collection of potentially optimal
parent sets for each variable in the data. Constructing such collections
naively is computationally intensive since the number of parent sets grows
exponentially with the number of variables. Thus, pruning techniques are not
only desirable but essential. While good pruning rules exist for the Bayesian
Information Criterion (BIC), current results for the Bayesian Dirichlet
equivalent uniform (BDeu) score reduce the search space very modestly,
hampering the use of the (often preferred) BDeu. We derive new non-trivial
theoretical upper bounds for the BDeu score that considerably improve on the
state-of-the-art. Since the new bounds are mathematically proven to be tighter
than previous ones and come at little extra computational cost, they are a
promising addition to BNSL methods.
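As a rough illustration of the setting (a minimal sketch under assumptions, not
the paper's method): candidate parent sets can be enumerated with the generic
subset-dominance rule, which is sound for any decomposable score (BIC or BDeu
alike) and discards a parent set if one of its kept subsets already scores at
least as well. The names and the placeholder `score` callable are illustrative;
the paper's tighter BDeu-specific upper bounds are not reproduced here.

    from itertools import combinations

    def candidate_parent_sets(variables, target, score, max_size=3):
        # Enumerate parent sets for `target` up to `max_size` parents,
        # pruning any set dominated by a subset (subset scores >= set).
        # `score(target, parent_set)` is a placeholder for a decomposable
        # local score such as BIC or BDeu (an assumption for illustration).
        others = [v for v in variables if v != target]
        kept = {frozenset(): score(target, frozenset())}
        for size in range(1, max_size + 1):
            for parents in combinations(others, size):
                ps = frozenset(parents)
                s = score(target, ps)
                # Keep ps only if it strictly beats every kept subset;
                # checking kept subsets suffices, since any pruned subset
                # is itself dominated by some kept subset.
                if all(s > kept[sub] for sub in kept if sub < ps):
                    kept[ps] = s
        return kept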
Learning All Credible Bayesian Network Structures for Model Averaging
A Bayesian network is a widely used probabilistic graphical model with
applications in knowledge discovery and prediction. Learning a Bayesian network
(BN) from data can be cast as an optimization problem using the well-known
score-and-search approach. However, selecting a single model (i.e., the
best-scoring BN) can be misleading or may not achieve the best possible
accuracy. An
alternative to committing to a single model is to perform some form of Bayesian
or frequentist model averaging, where the space of possible BNs is sampled or
enumerated in some fashion. Unfortunately, existing approaches for model
averaging either severely restrict the structure of the Bayesian network or
have only been shown to scale to networks with fewer than 30 random variables.
In this paper, we propose a novel approach to model averaging inspired by
performance guarantees in approximation algorithms. Our approach has two
primary advantages. First, our approach only considers credible models in that
they are optimal or near-optimal in score. Second, our approach is more
efficient and scales to significantly larger Bayesian networks than existing
approaches.
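A minimal sketch of the kind of model averaging described here, assuming each
credible (optimal or near-optimal) network is given as a set of directed edges
together with its log score: each network is weighted by its normalized
exponentiated score, and a posterior probability is accumulated per edge. The
input format and names are assumptions for illustration, not the paper's
interface.

    import math

    def edge_posteriors(networks, log_scores):
        # `networks`: list of sets of (parent, child) edge tuples.
        # `log_scores`: matching list of log scores (e.g., log BDeu).
        # Subtract the max before exponentiating for numerical stability.
        m = max(log_scores)
        weights = [math.exp(s - m) for s in log_scores]
        z = sum(weights)
        post = {}
        for edges, w in zip(networks, weights):
            for edge in edges:
                post[edge] = post.get(edge, 0.0) + w / z
        return post  # estimated posterior probability of each edge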
Entropy-based Pruning for Learning Bayesian Networks using BIC
For decomposable score-based structure learning of Bayesian networks, existing
approaches first compute a collection of candidate parent sets for each
variable and then optimize over this collection by choosing one parent set for
each variable without creating directed cycles while maximizing the total
score. We target the task of constructing the collection of candidate parent
sets when the score of choice is the Bayesian Information Criterion (BIC). We
provide new non-trivial results that can be used to prune the search space of
candidate parent sets of each node. We analyze how these new results relate to
previous ideas in the literature, both theoretically and empirically. We show
in experiments with UCI data sets that the gains can be significant. Since the
new pruning rules are easy to implement and have low computational cost, they
can be promptly integrated into all state-of-the-art methods for structure
learning of Bayesian networks.
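For reference, a minimal sketch of the decomposable BIC local score that such
pruning targets, computed from counts over discrete data. The log-likelihood
term equals minus N times the empirical conditional entropy of the variable
given its parents, which is the quantity entropy-based bounds exploit; the
penalty is (log N)/2 times (r - 1) times q, with r the number of states of the
variable and q the number of parent configurations. Data layout and names are
assumptions for illustration.

    import math
    from collections import Counter

    def bic_local_score(data, target, parents):
        # `data`: list of dicts mapping variable name -> discrete value
        # (an assumed layout). Returns BIC(target | parents).
        n = len(data)
        joint = Counter((tuple(row[p] for p in parents), row[target])
                        for row in data)
        marg = Counter(tuple(row[p] for p in parents) for row in data)
        # Maximized log-likelihood: sum of count * log(count / parent count),
        # i.e. -n times the empirical conditional entropy of target | parents.
        loglik = sum(c * math.log(c / marg[pa]) for (pa, _), c in joint.items())
        r = len({row[target] for row in data})   # states of the target
        q = 1
        for p in parents:                        # parent configurations
            q *= len({row[p] for row in data})
        penalty = 0.5 * math.log(n) * (r - 1) * q
        return loglik - penalty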