225 research outputs found

New Results for the MAP Problem in Bayesian Networks

Author: de Campos Cassio P.
Publication date: 01/01/2010

This paper presents new results for the (partial) maximum a posteriori (MAP) problem in Bayesian networks, which is the problem of querying the most probable state configuration of some of the network variables given evidence. First, it is demonstrated that the problem remains hard even in networks with very simple topology, such as binary polytrees and simple trees (including the Naive Bayes structure). Such proofs extend previous complexity results for the problem. Inapproximability results are also derived in the case of trees if the number of states per variable is not bounded. Although the problem is shown to be hard and inapproximable even in very simple scenarios, a new exact algorithm is described that is empirically fast in networks of bounded treewidth and bounded number of states per variable. The same algorithm is used as the basis of a Fully Polynomial Time Approximation Scheme for MAP under such assumptions. Approximation schemes were generally thought to be impossible for this problem, but we show otherwise for classes of networks that are important in practice. The algorithms are extensively tested using some well-known networks as well as randomly generated cases to show their effectiveness. Comment: A couple of typos were fixed, as well as the notation in part of section 4, which was misleading. Theoretical and empirical results have not changed.
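To make the query itself concrete, here is a minimal sketch of a partial MAP computation by brute-force enumeration. The three-variable chain A -> B -> C, its probability tables, and the function names are hypothetical and chosen only for illustration; this is exponential-time enumeration, not the exact algorithm or approximation scheme described in the abstract.

```python
# Hypothetical chain A -> B -> C over binary variables (illustrative numbers only).
P_A = {0: 0.6, 1: 0.4}
P_B_GIVEN_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # P_B_GIVEN_A[a][b]
P_C_GIVEN_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}  # P_C_GIVEN_B[b][c]


def joint(a, b, c):
    """Joint probability P(A=a, B=b, C=c) from the chain factorization."""
    return P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_B[b][c]


def partial_map(evidence_c):
    """Most probable state of the MAP variable A given evidence C = evidence_c.

    B is neither a MAP variable nor evidence, so it is summed out.
    Returns the best state of A and the unnormalized scores P(A=a, C=evidence_c).
    """
    scores = {a: sum(joint(a, b, evidence_c) for b in (0, 1)) for a in (0, 1)}
    best = max(scores, key=scores.get)
    return best, scores


best_a, scores = partial_map(evidence_c=1)
print(best_a, scores)
```

The mix of maximization (over A) and summation (over B) is what distinguishes partial MAP from finding the single most probable joint configuration.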

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

Repository TU/e

Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks

Authors: Conaty Diarmaid; de Campos Cassio P.; Mauá Denis D.
Publication date: 01/08/2017

We discuss the computational complexity of approximating maximum a posteriori inference in sum-product networks. We first show NP-hardness in trees of height two by a reduction from maximum independent set; this implies non-approximability within a sublinear factor. We show that this is a tight bound, as we can find an approximation within a linear factor in networks of height two. We then show that, in trees of height three, it is NP-hard to approximate the problem within a factor $2^{f(n)}$ for any sublinear function $f$ of the size of the input $n$. Again, this bound is tight, as we prove that the usual max-product algorithm finds (in any network) approximations within factor $2^{c \cdot n}$ for some constant $c < 1$. Last, we present a simple algorithm, and show that it provably produces solutions at least as good as, and potentially much better than, the max-product algorithm. We empirically analyze the proposed algorithm against max-product using synthetic and realistic networks. Comment: 18 pages.
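As a rough illustration of why max-product only approximates MAP, the sketch below evaluates a toy sum-product network in two ways: exactly, and with every sum node replaced by a weighted maximum. The tuple-based node encoding, the function names, and the example network are assumptions made for this sketch and do not come from the paper.

```python
# Hypothetical node encoding (an assumption for this sketch):
#   ("leaf", var, value)            indicator that var takes value
#   ("prod", [child, ...])          product node
#   ("sum", [(weight, child), ...]) weighted sum node


def spn_value(node, x):
    """Exact network value at a complete assignment x (dict var -> value)."""
    kind = node[0]
    if kind == "leaf":
        _, var, val = node
        return 1.0 if x[var] == val else 0.0
    if kind == "prod":
        result = 1.0
        for child in node[1]:
            result *= spn_value(child, x)
        return result
    return sum(w * spn_value(child, x) for w, child in node[1])


def max_product(node):
    """Replace each sum by a weighted max; return (score, assignment)."""
    kind = node[0]
    if kind == "leaf":
        _, var, val = node
        return 1.0, {var: val}
    if kind == "prod":
        score, assignment = 1.0, {}
        for child in node[1]:
            s, a = max_product(child)
            score *= s
            assignment.update(a)
        return score, assignment
    best_score, best_assignment = -1.0, {}
    for w, child in node[1]:
        s, a = max_product(child)
        if w * s > best_score:
            best_score, best_assignment = w * s, a
    return best_score, best_assignment


# Toy network: the mass of X=1 is split over two children of the root sum.
spn = ("sum", [(0.4, ("leaf", "X", 0)),
               (0.3, ("leaf", "X", 1)),
               (0.3, ("leaf", "X", 1))])

score, guess = max_product(spn)
print(guess, spn_value(spn, guess), spn_value(spn, {"X": 1}))
```

In this toy case max-product selects X=0 (value 0.4) even though X=1 has value 0.6, because the weighted max looks at one child at a time and misses mass that is spread over several children.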

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

On Pruning for Score-Based Bayesian Network Structure Learning

Authors: Correia Alvaro H. C.; Cussens James; de Campos Cassio
Publication date: 01/01/2019

Many algorithms for score-based Bayesian network structure learning (BNSL), in particular exact ones, take as input a collection of potentially optimal parent sets for each variable in the data. Constructing such collections naively is computationally intensive since the number of parent sets grows exponentially with the number of variables. Thus, pruning techniques are not only desirable but essential. While good pruning rules exist for the Bayesian Information Criterion (BIC), current results for the Bayesian Dirichlet equivalent uniform (BDeu) score reduce the search space very modestly, hampering the use of the (often preferred) BDeu. We derive new non-trivial theoretical upper bounds for the BDeu score that considerably improve on the state of the art. Since the new bounds are mathematically proven to be tighter than previous ones and come at little extra computational cost, they are a promising addition to BNSL methods.
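For orientation, the quantity being bounded is the local BDeu score of one variable under one candidate parent set. The sketch below computes it in its standard log-Gamma form from a table of counts; the function name, the count-table layout, and the default equivalent sample size are assumptions for illustration, and the code does not implement the upper bounds derived in the paper.

```python
from math import lgamma, log


def bdeu_local_score(counts, ess=1.0):
    """Log BDeu score of one variable for one candidate parent set.

    counts[j][k] is the number of samples with the parents in configuration j
    and the child in state k. The table is assumed to hold one row per parent
    configuration (rows of zeros are allowed and contribute nothing).
    ess is the equivalent sample size.
    """
    q = len(counts)       # number of parent configurations
    r = len(counts[0])    # number of child states
    a_j = ess / q         # prior mass per parent configuration
    a_jk = ess / (q * r)  # prior mass per (configuration, state) cell
    score = 0.0
    for row in counts:
        score += lgamma(a_j) - lgamma(a_j + sum(row))
        for n in row:
            score += lgamma(a_jk + n) - lgamma(a_jk)
    return score


# A binary variable without parents, observed once in state 0.
print(bdeu_local_score([[1, 0]]), log(0.5))
```

Each candidate parent set needs its own count table, so a bound that rules out a parent set before it is scored saves both the counting and the evaluation above.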

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

Explore Bristol Research

Confidence Statements for Ordering Quantiles

Authors: de Campos Cassio P.; Pereira Carlos A. de B.; Polpo Adriano
Publication venue: 'MDPI AG'
Publication date: 17/07/2014

This work proposes Quor, a simple yet effective nonparametric method to compare independent samples with respect to corresponding quantiles of their populations. The method is solely based on the order statistics of the samples, and independence is its only requirement. All computations are performed using exact distributions with no need for any asymptotic considerations, and yet can be run using a fast quadratic-time dynamic programming idea. Computational performance is essential in high-dimensional domains, such as gene expression data. We describe the approach and discuss its most important assumptions, drawing a parallel with the assumptions and properties of widely used techniques for the same problem. Experiments using real data from biomedical studies are performed to empirically compare Quor and other methods in a classification task over a selection of high-dimensional data sets.

arXiv.org e-Print Archive

CiteSeerX

Anytime Marginal MAP Inference

Authors: De Campos Cassio; Maua Denis
Publication date: 01/01/2012

This paper presents a new anytime algorithm for the marginal MAP problem in graphical models. The algorithm is described in detail, its complexity and convergence rate are studied, and relations to previous theoretical results for the problem are discussed. It is shown that the algorithm runs in polynomial time if the underlying graph of the model has bounded tree-width, and that it provides guarantees to the lower and upper bounds obtained within a fixed amount of computational resources. Experiments with both real and synthetically generated models highlight its main characteristics and show that it compares favorably against Park and Darwiche's systematic search, particularly in the case of problems with many MAP variables and moderate tree-width. Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012).

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

Repository TU/e

Learning Bounded Treewidth Bayesian Networks with Thousands of Variables

Authors: Corani Giorgio; de Campos Cassio P.; Scanagatta Mauro; Zaffalon Marco
Publication date: 11/05/2016

We present a method for learning treewidth-bounded Bayesian networks from data sets containing thousands of variables. Bounding the treewidth of a Bayesian network greatly reduces the complexity of inferences. Yet, being a global property of the graph, it considerably increases the difficulty of the learning process. We propose a novel algorithm for this task, able to scale to large domains and large treewidths. Our novel approach consistently outperforms the state of the art on data sets with up to ten thousand variables.

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

Bayesian Dependence Tests for Continuous, Binary and Mixed Continuous-Binary Variables

Authors: Benavoli Alessio; de Campos Cassio P.
Publication venue: 'MDPI AG'
Publication date: 01/01/2016

Tests for dependence of continuous, discrete and mixed continuous-discrete variables are ubiquitous in science. The goal of this paper is to derive Bayesian alternatives to frequentist null hypothesis significance tests for dependence. In particular, we present three Bayesian tests for dependence of binary, continuous and mixed variables. These tests are nonparametric and based on the Dirichlet Process, which allows us to use the same prior model for all of them. Therefore, the tests are “consistent” with each other, in the sense that the probabilities that variables are dependent computed with these tests are commensurable across the different types of variables being tested. By means of simulations with artificial data, we show the effectiveness of the new tests.

Queen's University Belfast Research Portal

Repository TU/e

Crossref

Directory of Open Access Journals

Advances in Learning Bayesian Networks of Bounded Treewidth

Authors: de Campos Cassio Polpo; Ji Qiang; Maua Denis Deratani; Nie Siqi
Publication date: 01/01/2014

This work presents novel algorithms for learning Bayesian network structures with bounded treewidth. Both exact and approximate methods are developed. The exact method combines mixed-integer linear programming formulations for structure learning and treewidth computation. The approximate method consists in uniformly sampling $k$-trees (maximal graphs of treewidth $k$), and subsequently selecting, exactly or approximately, the best structure whose moral graph is a subgraph of that $k$-tree. Some properties of these methods are discussed and proven. The approaches are empirically compared to each other and to a state-of-the-art method for learning bounded treewidth structures on a collection of public data sets with up to 100 variables. The experiments show that our exact algorithm outperforms the state of the art, and that the approximate approach is fairly accurate. Comment: 23 pages, 2 figures, 3 tables.
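To show what a $k$-tree looks like as a data structure, the sketch below grows one by the usual recursive construction: start from a clique on $k + 1$ vertices and repeatedly attach a new vertex to all members of an existing $k$-clique. The function name and its parameters are hypothetical, and picking the attachment clique at random this way is not a uniform sampler over $k$-trees, so it is a simplification of the sampling step mentioned in the abstract.

```python
import random
from itertools import combinations


def random_k_tree(n, k, seed=None):
    """Grow a k-tree on vertices 0..n-1 (requires n >= k + 1 and k >= 1).

    Returns the edge set as tuples (u, v) with u < v.
    Note: this simple growth process does NOT sample k-trees uniformly.
    """
    if k < 1 or n < k + 1:
        raise ValueError("need k >= 1 and n >= k + 1")
    rng = random.Random(seed)
    # Start from a clique on the first k + 1 vertices.
    edges = set(combinations(range(k + 1), 2))
    k_cliques = [frozenset(c) for c in combinations(range(k + 1), k)]
    for v in range(k + 1, n):
        base = rng.choice(k_cliques)  # existing k-clique to attach v to
        edges.update((u, v) for u in base)
        # New k-cliques are v together with each (k-1)-subset of base.
        k_cliques.extend(frozenset(s) | {v} for s in combinations(base, k - 1))
    return edges


edges = random_k_tree(n=6, k=2, seed=0)
print(len(edges), sorted(edges))
```

Every graph built this way has $k(k+1)/2 + k(n-k-1)$ edges; with $k = 1$ the construction yields ordinary trees.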

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

CiteSeerX

Repository TU/e

Learning Bayesian Networks with Incomplete Data by Augmentation

Authors: Adel Tameem; de Campos Cassio P.
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 08/10/2016

We present new algorithms for learning Bayesian networks from data with missing values using a data augmentation approach. An exact Bayesian network learning algorithm is obtained by recasting the problem into a standard Bayesian network learning problem without missing data. To the best of our knowledge, this is the first exact algorithm for this problem. As expected, the exact algorithm does not scale to large domains. We build on the exact method to create an approximate algorithm using a hill-climbing technique. This algorithm scales to large domains so long as a suitable standard structure learning method for complete data is available. We perform a wide range of experiments to demonstrate the benefits of learning Bayesian networks with this new approach.

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

Enlighten

Association for the Advancement of Artificial Intelligence: AAAI Publications