162 research outputs found
Structural and parametric uncertainties in full Bayesian and graphical lasso based approaches: beyond edge weights in psychological networks
Uncertainty over model structures poses a challenge
for many approaches exploring effect strength parameters at
the system level. Monte Carlo methods for full Bayesian model
averaging over model structures require considerable computational
resources, whereas bootstrapped graphical lasso and its
approximations offer scalable alternatives with lower complexity.
Although the computational efficiency of graphical lasso based
approaches has prompted a growing number of applications, the
restrictive assumptions of this approach are frequently ignored,
such as its inability to model interactions. We demonstrate
using an artificial and a real-world example that full Bayesian
averaging using Bayesian networks provides detailed estimates of
structural and parametric uncertainties through posterior
distributions, and is a feasible alternative that is routinely
applicable to mid-sized biomedical problems with hundreds of
variables. We compare Bayesian estimates with corresponding
frequentist quantities from bootstrapped graphical lasso using
pairwise Markov Random Fields, and discuss their interpretational
differences. We present results using synthetic data from
an artificial model and using the UK Biobank data set to explore
a psychopathological network centered around depression (this
research has been conducted using the UK Biobank Resource
under Application Number 1602).
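As a concrete illustration of the bootstrapped graphical lasso baseline discussed above, the sketch below fits scikit-learn's `GraphicalLasso` to resampled rows of synthetic Gaussian data and records how often each edge is selected. The regularization strength and the toy precision matrix are illustrative assumptions, not settings from the study.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Toy 4-variable Gaussian model with a known sparse precision structure
true_prec = np.array([[2.0, 0.6, 0.0, 0.0],
                      [0.6, 2.0, 0.6, 0.0],
                      [0.0, 0.6, 2.0, 0.0],
                      [0.0, 0.0, 0.0, 2.0]])
X = rng.multivariate_normal(np.zeros(4), np.linalg.inv(true_prec), size=500)

# Bootstrapped graphical lasso: refit on resampled rows and record how
# often each precision entry is estimated as nonzero (edge selection frequency).
n_boot, p = 50, X.shape[1]
edge_freq = np.zeros((p, p))
for _ in range(n_boot):
    Xb = X[rng.integers(0, len(X), len(X))]
    model = GraphicalLasso(alpha=0.1).fit(Xb)
    edge_freq += (np.abs(model.precision_) > 1e-6)
edge_freq /= n_boot
```

Such selection frequencies are the frequentist analogue of the posterior edge probabilities that full Bayesian averaging delivers, which is the comparison the abstract draws.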
Complexity analysis of Bayesian learning of high-dimensional DAG models and their equivalence classes
We consider MCMC methods for learning equivalence classes of sparse Gaussian
DAG models in the high-dimensional regime, where the number of variables may
exceed the sample size. The main contribution of this work is a rapid mixing
result for a random walk Metropolis-Hastings algorithm, which we prove using
a canonical path method. It reveals that the complexity of Bayesian learning
of sparse equivalence classes grows only polynomially in the number of
variables and the sample size,
under some common high-dimensional assumptions. Further, a series of
high-dimensional consistency results is obtained by the path method, including
the strong selection consistency of an empirical Bayes model for structure
learning and the consistency of a greedy local search on the restricted search
space. Rapid mixing and slow mixing results for other structure-learning MCMC
methods are also derived. Our path method and mixing time results yield crucial
insights into the computational aspects of high-dimensional structure learning,
which may be used to develop more efficient MCMC algorithms.
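The sampler whose mixing time is analysed here is, structurally, a random walk Metropolis-Hastings chain over graph structures. A toy version on 3-node DAGs, with an invented sparsity-favouring score standing in for a real marginal likelihood, can be sketched as:

```python
import random
import numpy as np

random.seed(0)
n = 3  # number of nodes

def is_dag(adj):
    """Check acyclicity by repeatedly peeling off sink nodes (Kahn-style)."""
    remaining = set(range(n))
    while remaining:
        sinks = [v for v in remaining
                 if not any(adj[v][w] for w in remaining)]
        if not sinks:
            return False  # every remaining node has an outgoing edge: cycle
        remaining -= set(sinks)
    return True

def score(adj):
    """Toy unnormalised posterior favouring sparse structures (illustrative)."""
    return np.exp(-0.5 * sum(map(sum, adj)))

adj = [[0] * n for _ in range(n)]  # start from the empty DAG
edge_counts = []
for _ in range(2000):
    i, j = random.sample(range(n), 2)   # random ordered pair, i != j
    proposal = [row[:] for row in adj]
    proposal[i][j] ^= 1                 # random walk move: flip edge i -> j
    # Metropolis acceptance; cyclic proposals are always rejected
    if is_dag(proposal) and random.random() < min(1.0, score(proposal) / score(adj)):
        adj = proposal
    edge_counts.append(sum(map(sum, adj)))
```

The paper's rapid mixing result concerns how many such steps are needed before the chain's distribution is close to the posterior; the toy chain above mixes almost instantly because the state space has only 25 DAGs.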
Structure Discovery in Bayesian Networks: Algorithms and Applications
Bayesian networks are a class of probabilistic graphical models that have been widely used in various tasks for probabilistic inference and causal modeling. A Bayesian network provides a compact, flexible, and interpretable representation of a joint probability distribution. When the network structure is unknown but there are observational data at hand, one can try to learn the network structure from the data. This is called structure discovery.
Structure discovery in Bayesian networks comprises several interesting problem variants. In the optimal Bayesian network learning problem (we call this structure learning), one aims to find a Bayesian network that best explains the data and then utilizes this optimal Bayesian network for predictions or inferences. In other variants, we are interested in finding the local structural features that are highly probable (we call this structure discovery). Both structure learning and structure discovery are considered very hard because existing approaches to these problems require highly intensive computations.
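For a handful of variables, both problem variants can be solved by brute force, which makes the distinction concrete: structure learning returns one best-scoring DAG, while structure discovery averages a structural feature over all DAGs weighted by their scores. The sketch below uses synthetic Gaussian data and a BIC-style score; all names and parameter values are invented for illustration.

```python
from itertools import product
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data from a chain X0 -> X1 -> X2
x0 = rng.normal(size=300)
x1 = 0.8 * x0 + rng.normal(scale=0.5, size=300)
x2 = 0.8 * x1 + rng.normal(scale=0.5, size=300)
X = np.column_stack([x0, x1, x2])
N, p = X.shape

def node_loglik(j, parents):
    """Gaussian log-likelihood of node j regressed on its parents."""
    A = np.column_stack([X[:, sorted(parents)], np.ones(N)]) if parents else np.ones((N, 1))
    resid = X[:, j] - A @ np.linalg.lstsq(A, X[:, j], rcond=None)[0]
    return -0.5 * N * (np.log(2 * np.pi * resid.var()) + 1)

def bic(dag):  # dag maps each node to its parent set
    ll = sum(node_loglik(j, dag[j]) for j in range(p))
    k = sum(len(dag[j]) + 2 for j in range(p))  # weights + intercept + variance
    return ll - 0.5 * k * np.log(N)

def acyclic(edges):
    rem = set(range(p))
    while rem:
        sinks = [v for v in rem if not any(a == v and b in rem for a, b in edges)]
        if not sinks:
            return False
        rem -= set(sinks)
    return True

# Enumerate every DAG on p nodes: each ordered pair of nodes gets an edge or not
slots = [(i, j) for i in range(p) for j in range(p) if i != j]
dags = []
for mask in product([0, 1], repeat=len(slots)):
    edges = {s for s, m in zip(slots, mask) if m}
    if acyclic(edges):
        dags.append({j: {i for i, jj in edges if jj == j} for j in range(p)})

scores = np.array([bic(d) for d in dags])
best = dags[int(np.argmax(scores))]                # structure *learning*: one MAP DAG
w = np.exp(scores - scores.max()); w /= w.sum()   # posterior weights over all DAGs
edge_prob = {(i, j): sum(wk for wk, d in zip(w, dags) if i in d[j])
             for i, j in slots}                    # structure *discovery*
```

The exhaustive enumeration is only feasible for tiny networks; the algorithms in this dissertation are precisely about avoiding it.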
In this dissertation, we develop algorithms to achieve more accurate, efficient and scalable structure discovery in Bayesian networks and demonstrate these algorithms in applications of systems biology and educational data mining. Specifically, this study is conducted in five directions.
First of all, we propose a novel heuristic algorithm for Bayesian network structure learning that takes advantage of the idea of curriculum learning and learns Bayesian network structures by stages. We prove theoretical advantages of our algorithm and also empirically show that it outperforms the state-of-the-art heuristic approach in learning Bayesian network structures.
Secondly, we develop an algorithm to efficiently enumerate the k-best equivalence classes of Bayesian networks where Bayesian networks in the same equivalence class are equally expressive in terms of representing probability distributions. We demonstrate our algorithm in the task of Bayesian model averaging. Our approach goes beyond the maximum-a-posteriori (MAP) model by listing the most likely network structures and their relative likelihood and therefore has important applications in causal structure discovery.
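The notion of equivalence class used here is Markov equivalence, characterised by Verma and Pearl as "same skeleton and same v-structures". A brute-force sketch (not the dissertation's enumeration algorithm) that groups all 3-node DAGs into their equivalence classes:

```python
from collections import defaultdict
from itertools import combinations, product

n = 3
pairs = list(combinations(range(n), 2))

def acyclic(edges):
    rem = set(range(n))
    while rem:
        sinks = [v for v in rem if not any(a == v and b in rem for a, b in edges)]
        if not sinks:
            return False
        rem -= set(sinks)
    return True

def all_dags():
    # every unordered pair is absent, oriented one way, or the other
    for choice in product((None, 0, 1), repeat=len(pairs)):
        edges = set()
        for (i, j), c in zip(pairs, choice):
            if c == 0:
                edges.add((i, j))
            elif c == 1:
                edges.add((j, i))
        if acyclic(edges):
            yield frozenset(edges)

def eq_class_key(edges):
    """Verma-Pearl characterisation: skeleton plus v-structures."""
    skeleton = frozenset(frozenset(e) for e in edges)
    vstructs = set()
    for k in range(n):
        parents = [i for i, kk in edges if kk == k]
        for a, b in combinations(sorted(parents), 2):
            if frozenset((a, b)) not in skeleton:  # a, b non-adjacent: a -> k <- b
                vstructs.add((a, k, b))
    return skeleton, frozenset(vstructs)

classes = defaultdict(list)
for dag in all_dags():
    classes[eq_class_key(dag)].append(dag)
```

On three nodes the 25 DAGs fall into 11 Markov equivalence classes, which the enumeration reproduces; the k-best algorithm of this chapter ranks such classes by score rather than individual DAGs.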
Thirdly, we study how parallelism can be used to tackle the exponential time and space complexity in the exact Bayesian structure discovery. We consider the problem of computing the exact posterior probabilities of directed edges in Bayesian networks. We present a parallel algorithm capable of computing the exact posterior probabilities of all possible directed edges with optimal parallel space efficiency and nearly optimal parallel time efficiency. We apply our algorithm to a biological data set for discovering the yeast pheromone response pathways.
Fourthly, we develop novel algorithms for computing the exact posterior probabilities of ancestor relations in Bayesian networks. The existing algorithm assumes an order-modular prior over Bayesian networks that does not respect Markov equivalence. Our algorithms allow a uniform prior and respect Markov equivalence. We apply our algorithm to a biological data set for discovering protein signaling pathways.
Finally, we introduce Combined student Modeling and prerequisite Discovery (COMMAND), a novel algorithm for jointly inferring a prerequisite graph and a student model from student performance data. COMMAND learns the skill prerequisite relations as a Bayesian network, which is capable of modeling the global prerequisite structure and capturing the conditional independence between skills. Our experiments on simulations and real student data suggest that COMMAND is better than prior methods in the literature. COMMAND is useful for designing intelligent tutoring systems that assess student knowledge or that offer remediation interventions to students.
Detecting Cancer-Related Genes and Gene-Gene Interactions by Machine Learning Methods
To understand the underlying molecular mechanisms of cancer, and thereby to improve the prevention, diagnosis and treatment of cancer, it is necessary to explore the activities of cancer-related genes and the interactions among these genes. In this dissertation, I use machine learning and computational methods to identify differential gene relations and detect gene-gene interactions. To identify gene pairs that have different relationships in normal versus cancer tissues, I develop an integrative method based on a bootstrapped K-S test to evaluate a large number of microarray datasets. The experimental results demonstrate that my method can find meaningful alterations in gene relations. For gene-gene interaction detection, I propose two Bayesian network based methods: DASSO-MB (Detection of ASSOciations using Markov Blanket) and EpiBN (Epistatic interaction detection using Bayesian Network model) to address the two critical challenges: searching and scoring. DASSO-MB is based on the concept of the Markov blanket in Bayesian networks. In EpiBN, I develop a new scoring function, which can reflect higher-order gene-gene interactions and detect the true number of disease markers, and apply a fast branch-and-bound (B&B) algorithm to learn the structure of the Bayesian network. Both DASSO-MB and EpiBN outperform some other commonly used methods and are scalable to genome-wide data.
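The abstract does not spell out the integrative procedure, but its core ingredient — bootstrapping a gene-pair relationship in each condition and comparing the resulting distributions with a K-S test — can be sketched as follows, with all data and parameters invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical expression of one gene pair: correlated in normal tissue,
# uncorrelated in tumour tissue (200 samples per condition)
normal = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], 200)
tumour = rng.multivariate_normal([0, 0], [[1, 0.0], [0.0, 1]], 200)

def boot_corrs(data, n_boot=500):
    """Bootstrap distribution of the Pearson correlation of a gene pair."""
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(data), len(data))
        out[b] = np.corrcoef(data[idx, 0], data[idx, 1])[0, 1]
    return out

# K-S test between the two bootstrap correlation distributions: a small
# p-value flags the pair as having a differential relation across conditions
stat, pval = stats.ks_2samp(boot_corrs(normal), boot_corrs(tumour))
```

In a genome-wide screen this comparison would be repeated per gene pair and dataset, with multiple-testing correction applied to the resulting p-values.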
Combinatorial and algebraic perspectives on the marginal independence structure of Bayesian networks
We consider the problem of estimating the marginal independence structure of
a Bayesian network from observational data in the form of an undirected graph
called the unconditional dependence graph. We show that unconditional
dependence graphs of Bayesian networks correspond to the graphs having equal
independence and intersection numbers. Using this observation, a Gröbner
basis for a toric ideal associated to unconditional dependence graphs of
Bayesian networks is given and then extended by additional binomial relations
to connect the space of all such graphs. An MCMC method, called GrUES
(Gröbner-based Unconditional Equivalence Search), is implemented based on the
resulting moves and applied to synthetic Gaussian data. GrUES recovers the true
marginal independence structure via a penalized maximum likelihood or MAP
estimate at a higher rate than simple independence tests, while also yielding an
estimate of the posterior, for which the HPD credible sets include the true
structure at a high rate for sufficiently dense data-generating graphs.
Some models are useful, but how do we know which ones? Towards a unified Bayesian model taxonomy
Probabilistic (Bayesian) modeling has experienced a surge of applications in
almost all quantitative sciences and industrial areas. This development is
driven by a combination of several factors, including better probabilistic
estimation algorithms, flexible software, increased computing power, and a
growing awareness of the benefits of probabilistic learning. However, a
principled Bayesian model building workflow is far from complete and many
challenges remain. To aid future research and applications of a principled
Bayesian workflow, we ask and provide answers for what we perceive as two
fundamental questions of Bayesian modeling, namely (a) "What actually is a
Bayesian model?" and (b) "What makes a good Bayesian model?". As an answer to
the first question, we propose the PAD model taxonomy that defines four basic
kinds of Bayesian models, each representing some combination of the assumed
joint distribution of all (known or unknown) variables (P), a posterior
approximator (A), and training data (D). As an answer to the second question,
we propose ten utility dimensions according to which we can evaluate Bayesian
models holistically, namely, (1) causal consistency, (2) parameter
recoverability, (3) predictive performance, (4) fairness, (5) structural
faithfulness, (6) parsimony, (7) interpretability, (8) convergence, (9)
estimation speed, and (10) robustness. Further, we propose two example utility
decision trees that describe hierarchies and trade-offs between utilities
depending on the inferential goals that drive model building and testing.