
    On Pruning for Score-Based Bayesian Network Structure Learning

    Many algorithms for score-based Bayesian network structure learning (BNSL), in particular exact ones, take as input a collection of potentially optimal parent sets for each variable in the data. Constructing such collections naively is computationally intensive, since the number of candidate parent sets grows exponentially with the number of variables; pruning techniques are therefore not just desirable but essential. While good pruning rules exist for the Bayesian Information Criterion (BIC), current results for the Bayesian Dirichlet equivalent uniform (BDeu) score reduce the search space only modestly, hampering the use of the (often preferred) BDeu. We derive new non-trivial theoretical upper bounds for the BDeu score that considerably improve on the state of the art. Since the new bounds are provably tighter than previous ones and come at little extra computational cost, they are a promising addition to BNSL methods.
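
    As a rough illustration of the setting (not the paper's new bounds), the sketch below computes the BDeu local score for one variable and then keeps only the "potentially optimal" parent sets, i.e. those that score strictly better than all of their subsets; the helper names and the simple subset-based pruning rule are assumptions for the demo.

```python
from itertools import combinations
from math import lgamma

import numpy as np


def bdeu_local_score(data, child, parents, ess=1.0):
    """BDeu score of `child` given `parents` on integer-coded `data` (rows = samples)."""
    r = int(data[:, child].max()) + 1                      # number of child states
    if parents:
        _, config_ids = np.unique(data[:, list(parents)], axis=0, return_inverse=True)
        q = int(np.prod([int(data[:, p].max()) + 1 for p in parents]))
    else:
        config_ids, q = np.zeros(len(data), dtype=int), 1
    score = 0.0
    for j in np.unique(config_ids):                        # unobserved configs contribute 0
        rows = data[config_ids == j, child]
        score += lgamma(ess / q) - lgamma(ess / q + len(rows))
        for k in range(r):
            n_jk = int((rows == k).sum())
            score += lgamma(ess / (q * r) + n_jk) - lgamma(ess / (q * r))
    return score


def potentially_optimal_parent_sets(data, child, candidates, max_size=2):
    """Keep only parent sets that score better than every one of their subsets."""
    scores = {ps: bdeu_local_score(data, child, ps)
              for size in range(max_size + 1)
              for ps in combinations(candidates, size)}
    return {ps: s for ps, s in scores.items()
            if all(s > scores[sub]
                   for k in range(len(ps)) for sub in combinations(ps, k))}


rng = np.random.default_rng(0)
toy = rng.integers(0, 2, size=(500, 4))                    # 500 samples, 4 binary variables
print(potentially_optimal_parent_sets(toy, child=0, candidates=(1, 2, 3)))
```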

    Scalable Exact Parent Sets Identification in Bayesian Networks Learning with Apache Spark

    In machine learning, the parent set identification problem is to find the set of random variables that best explains a selected variable, given the data and some predefined scoring function. This problem is a critical component of Bayesian network structure learning and Markov blanket discovery, and thus has many practical applications, ranging from fraud detection to clinical decision support. In this paper, we introduce a new distributed-memory approach to the exact parent set assignment problem. To achieve scalability, we derive theoretical bounds that constrain the search space when the MDL scoring function is used, and we reorganize the underlying dynamic programming so that computational density is increased and fine-grained synchronization is eliminated. We then design an efficient realization of our approach on the Apache Spark platform. Through experimental results, we demonstrate that the method maintains strong scalability on a 500-core standalone Spark cluster and can efficiently process data sets with 70 variables, far beyond the reach of currently available solutions.
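
    The following sketch illustrates the underlying problem rather than the paper's Spark pipeline: an MDL local score (log-likelihood minus a (log N)/2-per-parameter penalty; natural log is assumed here) evaluated exhaustively over all small parent sets, which is exactly the exponential search that the bounds and the distributed dynamic programming are designed to tame.

```python
from itertools import combinations
from math import log

import numpy as np


def mdl_local_score(data, child, parents):
    """MDL score of `child` given `parents`: log-likelihood minus the CPT-size penalty."""
    n = len(data)
    r = int(data[:, child].max()) + 1
    if parents:
        _, cfg = np.unique(data[:, list(parents)], axis=0, return_inverse=True)
        q = int(np.prod([int(data[:, p].max()) + 1 for p in parents]))
    else:
        cfg, q = np.zeros(n, dtype=int), 1
    ll = 0.0
    for j in np.unique(cfg):
        rows = data[cfg == j, child]
        for k in range(r):
            n_jk = int((rows == k).sum())
            if n_jk:
                ll += n_jk * log(n_jk / len(rows))
    penalty = 0.5 * log(n) * q * (r - 1)       # free parameters of the local CPT
    return ll - penalty


# exhaustive search over all parent sets up to size 2 (exponential in general)
rng = np.random.default_rng(1)
data = rng.integers(0, 2, size=(1000, 5))
best = max(((ps, mdl_local_score(data, 0, ps))
            for k in range(3) for ps in combinations(range(1, 5), k)),
           key=lambda t: t[1])
print(best)
```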

    Learning Bayesian Networks with Incomplete Data by Augmentation

    We present new algorithms for learning Bayesian networks from data with missing values using a data augmentation approach. An exact Bayesian network learning algorithm is obtained by recasting the problem as a standard Bayesian network learning problem without missing data. To the best of our knowledge, this is the first exact algorithm for this problem. As expected, the exact algorithm does not scale to large domains. We therefore build on the exact method to create an approximate algorithm based on a hill-climbing technique. This algorithm scales to large domains as long as a suitable standard structure learning method for complete data is available. We perform a wide range of experiments to demonstrate the benefits of learning Bayesian networks with this new approach.
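
    A minimal sketch of the augmentation idea, assuming a deliberately simple stand-in "learner" (best single parent by mutual information) so the loop runs end to end: complete the data, fit a model, re-impute the missing entries under it, and repeat. This illustrates the alternation, not the paper's exact algorithms.

```python
import numpy as np

MISSING = -1


def impute_mode(raw):
    """Initial completion: replace each missing entry with its column mode."""
    out = raw.copy()
    for j in range(out.shape[1]):
        col = out[:, j]
        col[col == MISSING] = np.bincount(col[col != MISSING]).argmax()
    return out


def best_parent(completed, child):
    """Stand-in complete-data learner: the single parent with highest mutual information."""
    def mi(x, y):
        joint, _, _ = np.histogram2d(x, y, bins=(int(x.max()) + 1, int(y.max()) + 1))
        joint /= joint.sum()
        px = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        nz = joint > 0
        return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())
    return max((v for v in range(completed.shape[1]) if v != child),
               key=lambda p: mi(completed[:, p], completed[:, child]))


def reimpute(raw, completed, child, parent):
    """Refill originally missing entries of `child` with its mode given the parent value."""
    for i in np.where(raw[:, child] == MISSING)[0]:
        rows = completed[completed[:, parent] == completed[i, parent], child]
        completed[i, child] = np.bincount(rows).argmax()


rng = np.random.default_rng(2)
raw = rng.integers(0, 2, size=(300, 3))
raw[rng.random(raw.shape) < 0.1] = MISSING         # hide ~10% of the entries
completed = impute_mode(raw)
for _ in range(3):                                 # a few augmentation sweeps
    for child in range(raw.shape[1]):
        reimpute(raw, completed, child, best_parent(completed, child))
print(completed[:5])
```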

    Learning Bayesian Networks with Thousands of Variables

    We present a method for learning Bayesian networks from data sets containing thousands of variables without the need for structure constraints. Our approach has two parts. The first is a novel algorithm that effectively explores the space of possible parent sets of a node. It guides the exploration towards the most promising parent sets on the basis of an approximated score function that is computed in constant time. The second is an improvement of an existing ordering-based algorithm for structure optimization. The new algorithm provably achieves a higher score than its original formulation. Our approach consistently outperforms the state of the art on very large data sets.
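
    A brief sketch of why ordering-based structure optimization scales (an illustration, not the paper's improved algorithm): once a variable ordering is fixed, each variable's best parent set can be chosen independently among its predecessors, so evaluating an ordering decomposes into independent lookups of (here, toy) local scores.

```python
from itertools import combinations


def best_network_for_ordering(ordering, local_score, max_parents=2):
    """Return {child: parent_set} maximizing the decomposable score under `ordering`."""
    dag = {}
    for pos, child in enumerate(ordering):
        preds = ordering[:pos]                       # parents must precede the child
        dag[child] = max(
            (ps for k in range(min(max_parents, len(preds)) + 1)
             for ps in combinations(preds, k)),
            key=lambda ps: local_score(child, ps),
        )
    return dag


# toy cached local scores, purely for demonstration
score = lambda child, ps: -0.1 * len(ps) - 0.01 * sum(ps) + (0.5 if ps else 0.0)
print(best_network_for_ordering((2, 0, 1, 3), score))
```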

    Learning Locally Minimax Optimal Bayesian Networks

    We consider the problem of learning Bayesian network models in a non-informative setting, where the only available information is a set of observational data, and no background knowledge is available. The problem can be divided into two different subtasks: learning the structure of the network (a set of independence relations), and learning the parameters of the model (that fix the probability distribution from the set of all distributions consistent with the chosen structure). There are not many theoretical frameworks that consistently handle both these problems together, the Bayesian framework being an exception. In this paper we propose an alternative, information-theoretic framework which sidesteps some of the technical problems facing the Bayesian approach. The framework is based on the minimax-optimal Normalized Maximum Likelihood (NML) distribution, which is motivated by the Minimum Description Length (MDL) principle. The resulting model selection criterion is consistent, and it provides a way to construct highly predictive Bayesian network models. Our empirical tests show that the proposed method compares favorably with alternative approaches in both model selection and prediction tasks.
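
    As a worked illustration of the NML machinery (not necessarily the paper's exact criterion), the sketch below computes the multinomial parametric complexity with the linear-time recurrence of Kontkanen and Myllymäki, C(n, k+2) = C(n, k+1) + (n/k)·C(n, k), and uses it to penalize a multinomial's maximum log-likelihood.

```python
from math import comb, log


def multinomial_complexity(n, k):
    """C(n, k): the NML normalizing term for k categories and n observations."""
    c1 = 1.0                                          # C(n, 1)
    c2 = sum(comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
             for h in range(n + 1)) if n else 1.0     # C(n, 2); 0**0 == 1 in Python
    if k == 1:
        return c1
    for j in range(1, k - 1):                         # C(n, j+2) = C(n, j+1) + (n/j) C(n, j)
        c1, c2 = c2, c2 + (n / j) * c1
    return c2


def nml_score(counts):
    """log NML of observed category counts: max log-likelihood minus log C(n, k)."""
    n = sum(counts)
    ll = sum(c * log(c / n) for c in counts if c)
    return ll - log(multinomial_complexity(n, len(counts)))


print(nml_score([30, 12, 8]))                         # comparing multinomial models
```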

    Improved Building Blocks for Secure Multi-Party Computation based on Secret Sharing with Honest Majority

    Secure multi-party computation permits the evaluation of any desired functionality on private data without disclosing the data to the participants. It is gaining popularity due to the increasing collection of user, customer, and patient data and the need to analyze data sets distributed across different organizations without disclosing them. Because adoption of secure computation techniques depends on their performance in practice, it is important to keep improving that performance. In this work, we focus on common non-trivial operations used by many types of programs, where any advance directly affects the runtime of the programs that rely on them. In particular, we treat the operations of reading or writing an element of an array at a private location and of integer multiplication. The focus of this work is the secret sharing setting with an honest majority in the semi-honest security model. We demonstrate the improvement of the proposed techniques over prior constructions via analytical and empirical evaluation.
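
    A toy sketch of one building block mentioned above, reading an array element at a private location: the index is secret-shared as a one-hot vector over a prime field, each party locally takes the dot product of its shares with the public array, and summing the parties' outputs reconstructs A[i] without revealing i. This is a simplified illustration, not the paper's honest-majority protocol.

```python
import secrets

P = 2_147_483_647                        # a Mersenne prime defining the field


def share(x, parties=3):
    """Additively secret-share x mod P among `parties` shareholders."""
    shares = [secrets.randbelow(P) for _ in range(parties - 1)]
    shares.append((x - sum(shares)) % P)
    return shares


def private_read(array, index, parties=3):
    """Read array[index] with the index hidden inside one-hot shares."""
    onehot_shares = [share(int(j == index), parties) for j in range(len(array))]
    # each party computes sum_j share_j * A[j] locally on its own shares
    partials = [sum(onehot_shares[j][p] * array[j] for j in range(len(array))) % P
                for p in range(parties)]
    return sum(partials) % P             # reconstruction across parties


A = [13, 55, 7, 42]
assert private_read(A, 2) == 7           # no single party ever saw the index 2
```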

    Generalized Bayesian Network Classifiers


    Improved Scalability and Accuracy of Bayesian Network Structure Learning in the Score-and-Search Paradigm

    A Bayesian network is a probabilistic graphical model that consists of a directed acyclic graph (DAG), where each node is a random variable with an attached conditional probability distribution (CPD). A Bayesian network (BN) can either be constructed by a domain expert or learned automatically from data using the well-known score-and-search approach, a form of unsupervised machine learning. Our interest here is in BNs as a knowledge discovery or data analysis tool, where the BN is learned automatically from data and then studied for the insights it provides on the domain, such as possible cause-effect relationships, probabilistic dependencies, and conditional independence relationships. Previous work has shown that the accuracy of a data analysis can be improved by (i) incorporating structured representations of the CPDs into the score-and-search approach for learning the DAG, and (ii) learning a set of DAGs from a dataset, rather than a single DAG, and performing a technique called model averaging to obtain a representative DAG. This thesis focuses on improving the accuracy of the score-and-search approach for learning a BN and on scaling the approach to datasets with larger numbers of random variables.

    We introduce a novel model averaging approach to learning a BN motivated by performance guarantees in approximation algorithms. Our approach considers all optimal and all near-optimal networks for model averaging. We provide pruning rules that retain optimality while enabling our approach to scale to BNs significantly larger than the current state of the art.

    We extend our model averaging approach to simultaneously learn the DAG and the local structure of the CPDs in the form of a noisy-OR representation. We provide an effective gradient descent algorithm for scoring a candidate noisy-OR using the widely used BIC score, together with pruning rules that allow the search to scale to medium-sized networks. Our empirical results provide evidence for the success of our approach to learning Bayesian networks that incorporate noisy-OR relations. We also extend our model averaging approach to simultaneously learn the DAG and the local structure of the CPDs using neural network representations. This approach compares favourably with approaches such as decision trees, and performs well in instances with low amounts of data.

    Finally, we introduce a score-and-search approach to simultaneously learn a DAG and model linear and non-linear local probabilistic relationships between variables using multivariate adaptive regression splines (MARS), polynomial regression models represented as piecewise spline functions. We show on a set of discrete and continuous benchmark instances that our proposed approach can improve the accuracy of the learned graph while scaling to instances with over 1,000 variables.
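
    As a concrete illustration of the noisy-OR local structure discussed above (parameter names are assumptions), the sketch below evaluates P(X = 1 | parents) = 1 − (1 − leak) · Π_i (1 − p_i)^{x_i}, which needs one parameter per parent plus a leak term instead of an exponential-size table, and scores it with a BIC-style penalty.

```python
from math import log

import numpy as np


def noisy_or_prob(parent_values, probs, leak=0.05):
    """P(child = 1) under a noisy-OR CPD for a binary parent assignment."""
    fail = (1.0 - leak) * np.prod([(1.0 - p) ** x
                                   for p, x in zip(probs, parent_values)])
    return 1.0 - fail


def noisy_or_bic(data, child, parents, probs, leak=0.05):
    """Log-likelihood of the noisy-OR CPD minus a BIC penalty for its k+1 parameters."""
    ll = 0.0
    for row in data:
        p1 = noisy_or_prob(row[parents], probs, leak)
        ll += log(p1) if row[child] == 1 else log(1.0 - p1)
    return ll - 0.5 * log(len(data)) * (len(parents) + 1)


rng = np.random.default_rng(3)
data = rng.integers(0, 2, size=(200, 4))
print(noisy_or_bic(data, child=0, parents=[1, 2, 3], probs=[0.7, 0.4, 0.2]))
```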

    Insights on Learning Tractable Probabilistic Graphical Models
