11,675 research outputs found
Bayesian optimization of the PC algorithm for learning Gaussian Bayesian networks
The PC algorithm is a popular method for learning the structure of Gaussian
Bayesian networks. It carries out statistical tests to determine absent edges
in the network. It is hence governed by two parameters: (i) The type of test,
and (ii) its significance level. These parameters are usually set to values
recommended by an expert. Nevertheless, such an approach can suffer from human
bias, leading to suboptimal reconstruction results. In this paper we consider a
more principled approach for choosing these parameters in an automatic way. For
this we optimize a reconstruction score evaluated on a set of different
Gaussian Bayesian networks. This objective is expensive to evaluate and lacks a
closed-form expression, which means that Bayesian optimization (BO) is a
natural choice. BO methods use a model to guide the search and are hence able
to exploit smoothness properties of the objective surface. We show that the
parameters found by a BO method outperform those found by a random search
strategy and the expert recommendation. Importantly, we have found that an
often overlooked statistical test provides the best over-all reconstruction
results
Learning Large-Scale Bayesian Networks with the sparsebn Package
Learning graphical models from data is an important problem with wide
applications, ranging from genomics to the social sciences. Nowadays datasets
often have upwards of thousands---sometimes tens or hundreds of thousands---of
variables and far fewer samples. To meet this challenge, we have developed a
new R package called sparsebn for learning the structure of large, sparse
graphical models with a focus on Bayesian networks. While there are many
existing software packages for this task, this package focuses on the unique
setting of learning large networks from high-dimensional data, possibly with
interventions. As such, the methods provided place a premium on scalability and
consistency in a high-dimensional setting. Furthermore, in the presence of
interventions, the methods implemented here achieve the goal of learning a
causal network from data. Additionally, the sparsebn package is fully
compatible with existing software packages for network analysis.Comment: To appear in the Journal of Statistical Software, 39 pages, 7 figure
Learning the structure of Bayesian Networks: A quantitative assessment of the effect of different algorithmic schemes
One of the most challenging tasks when adopting Bayesian Networks (BNs) is
the one of learning their structure from data. This task is complicated by the
huge search space of possible solutions, and by the fact that the problem is
NP-hard. Hence, full enumeration of all the possible solutions is not always
feasible and approximations are often required. However, to the best of our
knowledge, a quantitative analysis of the performance and characteristics of
the different heuristics to solve this problem has never been done before.
For this reason, in this work, we provide a detailed comparison of many
different state-of-the-arts methods for structural learning on simulated data
considering both BNs with discrete and continuous variables, and with different
rates of noise in the data. In particular, we investigate the performance of
different widespread scores and algorithmic approaches proposed for the
inference and the statistical pitfalls within them
- …