Parallelization of the PC Algorithm
This paper describes a parallel version of the PC algorithm
for learning the structure of a Bayesian network from data. The PC
algorithm is a constraint-based algorithm consisting of five steps, where
the first step is to perform a set of (conditional) independence tests
while the remaining four steps relate to identifying the structure of the
Bayesian network using the results of the (conditional) independence
tests. In this paper, we describe a new approach to parallelization of the
(conditional) independence testing as experiments illustrate that this is
by far the most time-consuming step. The proposed parallel PC algorithm
is evaluated on data sets generated at random from five different
real-world Bayesian networks. The results demonstrate that significant time
performance improvements are possible using the proposed algorithm.
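The idea of running the independence tests concurrently can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's method: the abstract does not say which test or which parallelization scheme is used, so the sketch uses a Fisher z-test on marginal (order-0) correlations and a thread pool, and the names `corr`, `fisher_z_independent`, and `parallel_order0_tests` are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_independent(data, i, j, alpha=0.05):
    """Order-0 test: is variable i marginally independent of variable j?

    `data` is a list of columns, one list of samples per variable.
    Assumes a Fisher z-test; the paper may use a different test.
    """
    n = len(data[i])
    r = corr(data[i], data[j])
    r = max(min(r, 0.999999), -0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, standard normal
    return i, j, p_value > alpha


def parallel_order0_tests(data, alpha=0.05, workers=4):
    """Run all pairwise order-0 tests concurrently.

    The tests do not depend on each other, so they can be mapped over a pool.
    """
    pairs = list(combinations(range(len(data)), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda p: fisher_z_independent(data, p[0], p[1], alpha), pairs
        )
        return {(i, j): independent for i, j, independent in results}


# Toy data: y is a linear function of x; w is exactly uncorrelated with x.
x = [-1.0, 0.0, 1.0] * 100
y = [2 * v + 1 for v in x]
w = [1.0, -2.0, 1.0] * 100
independent = parallel_order0_tests([x, y, w])
```

In this toy run, `independent[(0, 1)]` is `False` (x and y are dependent) and the two pairs involving `w` are `True`. A thread pool keeps the sketch short; for CPU-bound tests in Python a process pool would be the more usual choice.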
Reasoning about Independence in Probabilistic Models of Relational Data
We extend the theory of d-separation to cases in which data instances are not
independent and identically distributed. We show that applying the rules of
d-separation directly to the structure of probabilistic models of relational
data inaccurately infers conditional independence. We introduce relational
d-separation, a theory for deriving conditional independence facts from
relational models. We provide a new representation, the abstract ground graph,
that enables a sound, complete, and computationally efficient method for
answering d-separation queries about relational models, and we present
empirical results that demonstrate effectiveness.
Comment: 61 pages, substantial revisions to formalisms, theory, and related
wor
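For context, ordinary d-separation on a single directed acyclic graph can be checked with the standard moralized-ancestral-graph procedure. The sketch below shows only that classical check, which the abstract says is not sufficient for relational models; it does not implement relational d-separation or the abstract ground graph. The function name `d_separated` and the `parents` dictionary format are assumptions made for illustration.

```python
def d_separated(parents, xs, ys, zs):
    """Return True if node sets xs and ys are d-separated given zs.

    `parents` maps each node to an iterable of its parents in a DAG.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)

    # 1. Restrict to the nodes in xs, ys, zs and all of their ancestors.
    relevant = xs | ys | zs
    stack = list(relevant)
    while stack:
        v = stack.pop()
        for p in parents.get(v, ()):
            if p not in relevant:
                relevant.add(p)
                stack.append(p)

    # 2. Moralize: link each node to its parents, link co-parents, drop directions.
    adj = {v: set() for v in relevant}
    for v in relevant:
        ps = list(parents.get(v, ()))
        for p in ps:
            adj[v].add(p)
            adj[p].add(v)
        for a in ps:
            for b in ps:
                if a != b:
                    adj[a].add(b)

    # 3. Remove zs and test whether any node in ys is reachable from xs.
    frontier = [x for x in xs if x not in zs]
    seen = set(frontier)
    while frontier:
        v = frontier.pop()
        if v in ys:
            return False
        for w in adj[v]:
            if w not in zs and w not in seen:
                seen.add(w)
                frontier.append(w)
    return True


collider = {"C": ["A", "B"]}         # A -> C <- B
chain = {"B": ["A"], "C": ["B"]}     # A -> B -> C
```

With these two toy graphs, `d_separated(collider, {"A"}, {"B"}, set())` is `True`, but conditioning on the collider `C` makes it `False`; for the chain, `A` and `C` are separated only once `B` is conditioned on.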
Application of new probabilistic graphical models in the genetic regulatory networks studies
This paper introduces two new probabilistic graphical models for
reconstruction of genetic regulatory networks using DNA microarray data. One is
an Independence Graph (IG) model with either a forward or a backward search
algorithm and the other one is a Gaussian Network (GN) model with a novel
greedy search method. The performances of both models were evaluated on four
MAPK pathways in yeast and three simulated data sets. Generally, an IG model
provides a sparse graph but a GN model produces a dense graph where more
information about gene-gene interactions is preserved. Additionally, we found
two key limitations in the prediction of genetic regulatory networks using DNA
microarray data: first, the sample size may be insufficient, and second,
the complexity of network structures may not be captured without additional
data at the protein level. These limitations are present in all prediction
methods that use only DNA microarray data.
Comment: 38 pages, 3 figure
Bayesian Networks for Max-linear Models
We study Bayesian networks based on max-linear structural equations as
introduced in Gissibl and Klüppelberg [16] and provide a summary of their
independence properties. In particular we emphasize that distributions for such
networks are generally not faithful to the independence model determined by
their associated directed acyclic graph. In addition, we consider some of the
basic issues of estimation and discuss generalized maximum likelihood
estimation of the coefficients, using the concept of a generalized likelihood
ratio for non-dominated families as introduced by Kiefer and Wolfowitz [21].
Finally, we argue that the structure of a minimal network can asymptotically be
identified completely from observational data.
Comment: 18 page
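A max-linear structural equation sets each node to the maximum of its own scaled innovation and the scaled values of its parents. The sketch below evaluates such a system on a three-node graph. The function name `max_linear_values`, the coefficient values, and the innovation values are all assumptions made for illustration and are not taken from the paper.

```python
def max_linear_values(parents, coef, noise, order):
    """Evaluate X_i = max(c_ii * Z_i, max over parents k of c_ki * X_k).

    `order` must be a topological order of the DAG; `coef[(k, i)]` is the
    positive weight on edge k -> i, and `coef[(i, i)]` scales the innovation Z_i.
    """
    x = {}
    for i in order:
        candidates = [coef[(i, i)] * noise[i]]
        candidates += [coef[(k, i)] * x[k] for k in parents.get(i, [])]
        x[i] = max(candidates)
    return x


# Hypothetical DAG: 1 -> 2 -> 3 plus a direct edge 1 -> 3.
parents = {2: [1], 3: [1, 2]}
coef = {(1, 1): 1.0, (2, 2): 1.0, (3, 3): 1.0,
        (1, 2): 0.5, (2, 3): 0.5, (1, 3): 0.1}
noise = {1: 4.0, 2: 1.0, 3: 0.5}

values = max_linear_values(parents, coef, noise, order=[1, 2, 3])
# values == {1: 4.0, 2: 2.0, 3: 1.0}
```

In this toy setup the direct edge 1 -> 3 never determines the value of node 3, because its weight 0.1 is smaller than the product 0.5 * 0.5 along the path through node 2, and node 2 is always at least 0.5 times node 1. This is one way to see why the distribution need not be faithful to the graph as drawn, and why a minimal network is the natural target for identification.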