Search CORE

11 research outputs found

Computing Posterior Probabilities of Ancestor Relations in Bayesian Networks

Author: Chen Yetian
Tian Jin
Publication venue: Iowa State University Digital Repository
Publication date: 27/08/2014
Field of study

In this paper we develop a dynamic programming algorithm to compute the exact posterior probabilities of ancestor relations in Bayesian networks. Previous dynamic programming (DP) algorithm by (Parviainen and Koivisto, 2011) evaluates all possible ancestor relations in time O(n3n) and space O(3n) . However, their algorithm assumes an order-modular prior over DAGs that does not respect Markov equivalence. The resulting posteriors would bias towards DAGs consistent with more linear orders. To adhere to uniform prior, we develop a new DP algorithm that computes the exact posteriors of all possible ancestor relations in time O(n5n-1) and space O(3n)

Digital Repository @ Iowa State University (ISU)

A Parallel Algorithm for Exact Bayesian Structure Discovery in Bayesian Networks

Author: Jin Tian
Olga Nikolova
Sage Bionetworks
Srinivas Aluru
Yetian Chen
Publication venue
Publication date: 13/08/2016
Field of study

Exact Bayesian structure discovery in Bayesian networks requires exponential time and space. Using dynamic programming (DP), the fastest known sequential algorithm computes the exact posterior probabilities of structural features in

O(2(d+1)n2^n)

time and space, if the number of nodes (variables) in the Bayesian network is

n

and the in-degree (the number of parents) per node is bounded by a constant

d

. Here we present a parallel algorithm capable of computing the exact posterior probabilities for all

n(n-1)

edges with optimal parallel space efficiency and nearly optimal parallel time efficiency. That is, if

p=2^k

processors are used, the run-time reduces to

O(5(d+1)n2^{n-k}+k(n-k)^d)

and the space usage becomes

O(n2^{n-k})

per processor. Our algorithm is based the observation that the subproblems in the sequential DP algorithm constitute a

n

D

hypercube. We take a delicate way to coordinate the computation of correlated DP procedures such that large amount of data exchange is suppressed. Further, we develop parallel techniques for two variants of the well-known \emph{zeta transform}, which have applications outside the context of Bayesian networks. We demonstrate the capability of our algorithm on datasets with up to 33 variables and its scalability on up to 2048 processors. We apply our algorithm to a biological data set for discovering the yeast pheromone response pathways.Comment: 32 pages, 12 figure

arXiv.org e-Print Archive

CiteSeerX

Exact Learning of Bounded Tree-width Bayesian Networks

Author
Publication venue
Publication date: 05/03/2020
Field of study

Abstract Inference in Bayesian networks is known to be NP-hard, but if the network has bounded treewidth, then inference becomes tractable. Not surprisingly, learning networks that closely match the given data and have a bounded tree-width has recently attracted some attention. In this paper we aim to lay groundwork for future research on the topic by studying the exact complexity of this problem. We give the first non-trivial exact algorithm for the NP-hard problem of finding an optimal Bayesian network of tree-width at most w, with running time 3 n n w+O (1) , and provide an implementation of this algorithm. Additionally, we propose a variant of Bayesian network learning with "super-structures", and show that finding a Bayesian network consistent with a given super-structure is fixedparameter tractable in the tree-width of the super-structure

CiteSeerX

A survey of Bayesian Network structure learning

Author: Chobtham K
Constantinou ACC
Guo Z
Kitson NK
Liu Y
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/01/2023
Field of study

Queen Mary Research Online

Structure Discovery in Bayesian Networks: Algorithms and Applications

Author: Chen Yetian
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2016
Field of study

Bayesian networks are a class of probabilistic graphical models that have been widely used in various tasks for probabilistic inference and causal modeling. A Bayesian network provides a compact, flexible, and interpretable representation of a joint probability distribution. When the network structure is unknown but there are observational data at hand, one can try to learn the network structure from the data. This is called structure discovery. Structure discovery in Bayesian networks is a host of several interesting problem variants. In the optimal Bayesian network learning problem (we call this structure learning), one aims to find a Bayesian network that best explains the data and then utilizes this optimal Bayesian network for predictions or inferences. In others, we are interested in finding the local structural features that are highly probable (we call this structure discovery). Both structure learning and structure discovery are considered very hard because existing approaches to these problems require highly intensive computations. In this dissertation, we develop algorithms to achieve more accurate, efficient and scalable structure discovery in Bayesian networks and demonstrate these algorithms in applications of systems biology and educational data mining. Specifically, this study is conducted in five directions. First of all, we propose a novel heuristic algorithm for Bayesian network structure learning that takes advantage of the idea of curriculum learning and learns Bayesian network structures by stages. We prove theoretical advantages of our algorithm and also empirically show that it outperforms the state-of-the-art heuristic approach in learning Bayesian network structures. Secondly, we develop an algorithm to efficiently enumerate the k-best equivalence classes of Bayesian networks where Bayesian networks in the same equivalence class are equally expressive in terms of representing probability distributions. We demonstrate our algorithm in the task of Bayesian model averaging. Our approach goes beyond the maximum-a-posteriori (MAP) model by listing the most likely network structures and their relative likelihood and therefore has important applications in causal structure discovery. Thirdly, we study how parallelism can be used to tackle the exponential time and space complexity in the exact Bayesian structure discovery. We consider the problem of computing the exact posterior probabilities of directed edges in Bayesian networks. We present a parallel algorithm capable of computing the exact posterior probabilities of all possible directed edges with optimal parallel space efficiency and nearly optimal parallel time efficiency. We apply our algorithm to a biological data set for discovering the yeast pheromone response pathways. Fourthly, we develop novel algorithms for computing the exact posterior probabilities of ancestor relations in Bayesian networks. Existing algorithm assumes an order-modular prior over Bayesian networks that does not respect Markov equivalence. Our algorithm allows uniform prior and respects the Markov equivalence. We apply our algorithm to a biological data set for discovering protein signaling pathways. Finally, we introduce Combined student Modeling and prerequisite Discovery (COMMAND), a novel algorithm for jointly inferring a prerequisite graph and a student model from student performance data. COMMAND learns the skill prerequisite relations as a Bayesian network, which is capable of modeling the global prerequisite structure and capturing the conditional independence between skills. Our experiments on simulations and real student data suggest that COMMAND is better than prior methods in the literature. COMMAND is useful for designing intelligent tutoring systems that assess student knowledge or that offer remediation interventions to students

Digital Repository @ Iowa State University (ISU)