162 research outputs found
Structural and parametric uncertainties in full Bayesian and graphical lasso based approaches: beyond edge weights in psychological networks
Uncertainty over model structures poses a challenge
for many approaches exploring effect strength parameters at
the system level. Monte Carlo methods for full Bayesian model
averaging over model structures require considerable computational
resources, whereas bootstrapped graphical lasso and its
approximations offer scalable alternatives with lower complexity.
Although the computational efficiency of graphical lasso based
approaches has prompted a growing number of applications, the
restrictive assumptions of this approach are frequently ignored,
such as its inability to model interactions. We demonstrate
using an artificial and a real-world example that full Bayesian
averaging using Bayesian networks provides detailed estimates of
structural and parametric uncertainties through posterior
distributions, and is a feasible alternative that is routinely
applicable to mid-sized biomedical problems with hundreds of
variables. We compare Bayesian estimates with corresponding
frequentist quantities from bootstrapped graphical lasso using
pairwise Markov Random Fields, and discuss their interpretational
differences. We present results using synthetic data from
an artificial model and using the UK Biobank data set to explore
a psychopathological network centered around depression (this
research has been conducted using the UK Biobank Resource
under Application Number 1602).
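As a concrete illustration of the bootstrapped graphical lasso baseline discussed above, the sketch below fits scikit-learn's `GraphicalLasso` to resampled rows of synthetic Gaussian data and records how often each edge is selected. The regularization strength and the toy precision matrix are illustrative assumptions, not settings from the study.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Toy 4-variable Gaussian model with a known sparse precision structure
true_prec = np.array([[2.0, 0.6, 0.0, 0.0],
                      [0.6, 2.0, 0.6, 0.0],
                      [0.0, 0.6, 2.0, 0.0],
                      [0.0, 0.0, 0.0, 2.0]])
X = rng.multivariate_normal(np.zeros(4), np.linalg.inv(true_prec), size=500)

# Bootstrapped graphical lasso: refit on resampled rows and record how
# often each precision entry is estimated as nonzero (edge selection frequency).
n_boot, p = 50, X.shape[1]
edge_freq = np.zeros((p, p))
for _ in range(n_boot):
    Xb = X[rng.integers(0, len(X), len(X))]
    model = GraphicalLasso(alpha=0.1).fit(Xb)
    edge_freq += (np.abs(model.precision_) > 1e-6)
edge_freq /= n_boot
```

Such selection frequencies are the frequentist analogue of the posterior edge probabilities that full Bayesian averaging delivers, which is the comparison the abstract draws.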
Complexity analysis of Bayesian learning of high-dimensional DAG models and their equivalence classes
We consider MCMC methods for learning equivalence classes of sparse Gaussian
DAG models in the high-dimensional regime, where the number of variables may
exceed the sample size. The main contribution of this work is a rapid mixing
result for a random walk Metropolis-Hastings algorithm, which we prove using
a canonical path method. It reveals that the complexity of Bayesian learning
of sparse equivalence classes grows only polynomially in the number of
variables and the sample size,
under some common high-dimensional assumptions. Further, a series of
high-dimensional consistency results is obtained by the path method, including
the strong selection consistency of an empirical Bayes model for structure
learning and the consistency of a greedy local search on the restricted search
space. Rapid mixing and slow mixing results for other structure-learning MCMC
methods are also derived. Our path method and mixing time results yield crucial
insights into the computational aspects of high-dimensional structure learning,
which may be used to develop more efficient MCMC algorithms.
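The sampler whose mixing time is analysed here is, structurally, a random walk Metropolis-Hastings chain over graph structures. A toy version on 3-node DAGs, with an invented sparsity-favouring score standing in for a real marginal likelihood, can be sketched as:

```python
import random
import numpy as np

random.seed(0)
n = 3  # number of nodes

def is_dag(adj):
    """Check acyclicity by repeatedly peeling off sink nodes (Kahn-style)."""
    remaining = set(range(n))
    while remaining:
        sinks = [v for v in remaining
                 if not any(adj[v][w] for w in remaining)]
        if not sinks:
            return False  # every remaining node has an outgoing edge: cycle
        remaining -= set(sinks)
    return True

def score(adj):
    """Toy unnormalised posterior favouring sparse structures (illustrative)."""
    return np.exp(-0.5 * sum(map(sum, adj)))

adj = [[0] * n for _ in range(n)]  # start from the empty DAG
edge_counts = []
for _ in range(2000):
    i, j = random.sample(range(n), 2)   # random ordered pair, i != j
    proposal = [row[:] for row in adj]
    proposal[i][j] ^= 1                 # random walk move: flip edge i -> j
    # Metropolis acceptance; cyclic proposals are always rejected
    if is_dag(proposal) and random.random() < min(1.0, score(proposal) / score(adj)):
        adj = proposal
    edge_counts.append(sum(map(sum, adj)))
```

The paper's rapid mixing result concerns how many such steps are needed before the chain's distribution is close to the posterior; the toy chain above mixes almost instantly because the state space has only 25 DAGs.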
Structure Discovery in Bayesian Networks: Algorithms and Applications
Bayesian networks are a class of probabilistic graphical models that have been widely used in various tasks for probabilistic inference and causal modeling. A Bayesian network provides a compact, flexible, and interpretable representation of a joint probability distribution. When the network structure is unknown but there are observational data at hand, one can try to learn the network structure from the data. This is called structure discovery.
Structure discovery in Bayesian networks comprises several interesting problem variants. In the optimal Bayesian network learning problem (we call this structure learning), one aims to find a Bayesian network that best explains the data and then utilizes this optimal Bayesian network for predictions or inferences. In other variants, we are interested in finding the local structural features that are highly probable (we call this structure discovery). Both structure learning and structure discovery are considered very hard because existing approaches to these problems require highly intensive computations.
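For a handful of variables, both problem variants can be solved by brute force, which makes the distinction concrete: structure learning returns one best-scoring DAG, while structure discovery averages a structural feature over all DAGs weighted by their scores. The sketch below uses synthetic Gaussian data and a BIC-style score; all names and parameter values are invented for illustration.

```python
from itertools import product
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data from a chain X0 -> X1 -> X2
x0 = rng.normal(size=300)
x1 = 0.8 * x0 + rng.normal(scale=0.5, size=300)
x2 = 0.8 * x1 + rng.normal(scale=0.5, size=300)
X = np.column_stack([x0, x1, x2])
N, p = X.shape

def node_loglik(j, parents):
    """Gaussian log-likelihood of node j regressed on its parents."""
    A = np.column_stack([X[:, sorted(parents)], np.ones(N)]) if parents else np.ones((N, 1))
    resid = X[:, j] - A @ np.linalg.lstsq(A, X[:, j], rcond=None)[0]
    return -0.5 * N * (np.log(2 * np.pi * resid.var()) + 1)

def bic(dag):  # dag maps each node to its parent set
    ll = sum(node_loglik(j, dag[j]) for j in range(p))
    k = sum(len(dag[j]) + 2 for j in range(p))  # weights + intercept + variance
    return ll - 0.5 * k * np.log(N)

def acyclic(edges):
    rem = set(range(p))
    while rem:
        sinks = [v for v in rem if not any(a == v and b in rem for a, b in edges)]
        if not sinks:
            return False
        rem -= set(sinks)
    return True

# Enumerate every DAG on p nodes: each ordered pair of nodes gets an edge or not
slots = [(i, j) for i in range(p) for j in range(p) if i != j]
dags = []
for mask in product([0, 1], repeat=len(slots)):
    edges = {s for s, m in zip(slots, mask) if m}
    if acyclic(edges):
        dags.append({j: {i for i, jj in edges if jj == j} for j in range(p)})

scores = np.array([bic(d) for d in dags])
best = dags[int(np.argmax(scores))]                # structure *learning*: one MAP DAG
w = np.exp(scores - scores.max()); w /= w.sum()   # posterior weights over all DAGs
edge_prob = {(i, j): sum(wk for wk, d in zip(w, dags) if i in d[j])
             for i, j in slots}                    # structure *discovery*
```

The exhaustive enumeration is only feasible for tiny networks; the algorithms in this dissertation are precisely about avoiding it.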
In this dissertation, we develop algorithms to achieve more accurate, efficient and scalable structure discovery in Bayesian networks and demonstrate these algorithms in applications of systems biology and educational data mining. Specifically, this study is conducted in five directions.
First of all, we propose a novel heuristic algorithm for Bayesian network structure learning that takes advantage of the idea of curriculum learning and learns Bayesian network structures by stages. We prove theoretical advantages of our algorithm and also empirically show that it outperforms the state-of-the-art heuristic approach in learning Bayesian network structures.
Secondly, we develop an algorithm to efficiently enumerate the k-best equivalence classes of Bayesian networks where Bayesian networks in the same equivalence class are equally expressive in terms of representing probability distributions. We demonstrate our algorithm in the task of Bayesian model averaging. Our approach goes beyond the maximum-a-posteriori (MAP) model by listing the most likely network structures and their relative likelihood and therefore has important applications in causal structure discovery.
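The notion of equivalence class used here is Markov equivalence, characterised by Verma and Pearl as "same skeleton and same v-structures". A brute-force sketch (not the dissertation's enumeration algorithm) that groups all 3-node DAGs into their equivalence classes:

```python
from collections import defaultdict
from itertools import combinations, product

n = 3
pairs = list(combinations(range(n), 2))

def acyclic(edges):
    rem = set(range(n))
    while rem:
        sinks = [v for v in rem if not any(a == v and b in rem for a, b in edges)]
        if not sinks:
            return False
        rem -= set(sinks)
    return True

def all_dags():
    # every unordered pair is absent, oriented one way, or the other
    for choice in product((None, 0, 1), repeat=len(pairs)):
        edges = set()
        for (i, j), c in zip(pairs, choice):
            if c == 0:
                edges.add((i, j))
            elif c == 1:
                edges.add((j, i))
        if acyclic(edges):
            yield frozenset(edges)

def eq_class_key(edges):
    """Verma-Pearl characterisation: skeleton plus v-structures."""
    skeleton = frozenset(frozenset(e) for e in edges)
    vstructs = set()
    for k in range(n):
        parents = [i for i, kk in edges if kk == k]
        for a, b in combinations(sorted(parents), 2):
            if frozenset((a, b)) not in skeleton:  # a, b non-adjacent: a -> k <- b
                vstructs.add((a, k, b))
    return skeleton, frozenset(vstructs)

classes = defaultdict(list)
for dag in all_dags():
    classes[eq_class_key(dag)].append(dag)
```

On three nodes the 25 DAGs fall into 11 Markov equivalence classes, which the enumeration reproduces; the k-best algorithm of this chapter ranks such classes by score rather than individual DAGs.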
Thirdly, we study how parallelism can be used to tackle the exponential time and space complexity in the exact Bayesian structure discovery. We consider the problem of computing the exact posterior probabilities of directed edges in Bayesian networks. We present a parallel algorithm capable of computing the exact posterior probabilities of all possible directed edges with optimal parallel space efficiency and nearly optimal parallel time efficiency. We apply our algorithm to a biological data set for discovering the yeast pheromone response pathways.
Fourthly, we develop novel algorithms for computing the exact posterior probabilities of ancestor relations in Bayesian networks. The existing algorithm assumes an order-modular prior over Bayesian networks that does not respect Markov equivalence. Our algorithms allow a uniform prior and respect Markov equivalence. We apply our algorithm to a biological data set for discovering protein signaling pathways.
Finally, we introduce Combined student Modeling and prerequisite Discovery (COMMAND), a novel algorithm for jointly inferring a prerequisite graph and a student model from student performance data. COMMAND learns the skill prerequisite relations as a Bayesian network, which is capable of modeling the global prerequisite structure and capturing the conditional independence between skills. Our experiments on simulations and real student data suggest that COMMAND is better than prior methods in the literature. COMMAND is useful for designing intelligent tutoring systems that assess student knowledge or that offer remediation interventions to students.
Detecting Cancer-Related Genes and Gene-Gene Interactions by Machine Learning Methods
To understand the underlying molecular mechanisms of cancer, and thereby to improve the prevention, diagnosis and treatment of cancer, it is necessary to explore the activities of cancer-related genes and the interactions among these genes. In this dissertation, I use machine learning and computational methods to identify differential gene relations and detect gene-gene interactions. To identify gene pairs that have different relationships in normal versus cancer tissues, I develop an integrative method based on a bootstrapped K-S test to evaluate a large number of microarray datasets. The experimental results demonstrate that my method can find meaningful alterations in gene relations. For gene-gene interaction detection, I propose two Bayesian network based methods: DASSO-MB (Detection of ASSOciations using Markov Blanket) and EpiBN (Epistatic interaction detection using Bayesian Network model) to address the two critical challenges: searching and scoring. DASSO-MB is based on the concept of the Markov blanket in Bayesian networks. In EpiBN, I develop a new scoring function, which can reflect higher-order gene-gene interactions and detect the true number of disease markers, and apply a fast branch-and-bound (B&B) algorithm to learn the structure of the Bayesian network. Both DASSO-MB and EpiBN outperform some other commonly used methods and are scalable to genome-wide data.
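The abstract does not spell out the integrative procedure, but its core ingredient — bootstrapping a gene-pair relationship in each condition and comparing the resulting distributions with a K-S test — can be sketched as follows, with all data and parameters invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical expression of one gene pair: correlated in normal tissue,
# uncorrelated in tumour tissue (200 samples per condition)
normal = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], 200)
tumour = rng.multivariate_normal([0, 0], [[1, 0.0], [0.0, 1]], 200)

def boot_corrs(data, n_boot=500):
    """Bootstrap distribution of the Pearson correlation of a gene pair."""
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(data), len(data))
        out[b] = np.corrcoef(data[idx, 0], data[idx, 1])[0, 1]
    return out

# K-S test between the two bootstrap correlation distributions: a small
# p-value flags the pair as having a differential relation across conditions
stat, pval = stats.ks_2samp(boot_corrs(normal), boot_corrs(tumour))
```

In a genome-wide screen this comparison would be repeated per gene pair and dataset, with multiple-testing correction applied to the resulting p-values.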
Combinatorial and algebraic perspectives on the marginal independence structure of Bayesian networks
We consider the problem of estimating the marginal independence structure of
a Bayesian network from observational data in the form of an undirected graph
called the unconditional dependence graph. We show that unconditional
dependence graphs of Bayesian networks correspond to the graphs having equal
independence and intersection numbers. Using this observation, a Gröbner
basis for a toric ideal associated to unconditional dependence graphs of
Bayesian networks is given and then extended by additional binomial relations
to connect the space of all such graphs. An MCMC method, called GrUES
(Gröbner-based Unconditional Equivalence Search), is implemented based on the
resulting moves and applied to synthetic Gaussian data. GrUES recovers the true
marginal independence structure via a penalized maximum likelihood or MAP
estimate at a higher rate than simple independence tests, while also yielding an
estimate of the posterior, for which the HPD credible sets include the true
structure at a high rate for sufficiently dense data-generating graphs.
Some models are useful, but how do we know which ones? Towards a unified Bayesian model taxonomy
Probabilistic (Bayesian) modeling has experienced a surge of applications in
almost all quantitative sciences and industrial areas. This development is
driven by a combination of several factors, including better probabilistic
estimation algorithms, flexible software, increased computing power, and a
growing awareness of the benefits of probabilistic learning. However, a
principled Bayesian model building workflow is far from complete and many
challenges remain. To aid future research and applications of a principled
Bayesian workflow, we ask and provide answers for what we perceive as two
fundamental questions of Bayesian modeling, namely (a) "What actually is a
Bayesian model?" and (b) "What makes a good Bayesian model?". As an answer to
the first question, we propose the PAD model taxonomy that defines four basic
kinds of Bayesian models, each representing some combination of the assumed
joint distribution of all (known or unknown) variables (P), a posterior
approximator (A), and training data (D). As an answer to the second question,
we propose ten utility dimensions according to which we can evaluate Bayesian
models holistically, namely, (1) causal consistency, (2) parameter
recoverability, (3) predictive performance, (4) fairness, (5) structural
faithfulness, (6) parsimony, (7) interpretability, (8) convergence, (9)
estimation speed, and (10) robustness. Further, we propose two example utility
decision trees that describe hierarchies and trade-offs between utilities
depending on the inferential goals that drive model building and testing.