29,896 research outputs found
Dirichlet Bayesian Network Scores and the Maximum Relative Entropy Principle
A classic approach for learning Bayesian networks from data is to identify a
maximum a posteriori (MAP) network structure. In the case of discrete Bayesian
networks, MAP networks are selected by maximising one of several possible
Bayesian Dirichlet (BD) scores; the most famous is the Bayesian Dirichlet
equivalent uniform (BDeu) score from Heckerman et al (1995). The key properties
of BDeu arise from its uniform prior over the parameters of each local
distribution in the network, which makes structure learning computationally
efficient; it does not require the elicitation of prior knowledge from experts;
and it satisfies score equivalence.
In this paper we will review the derivation and the properties of BD scores,
and of BDeu in particular, and we will link them to the corresponding entropy
estimates to study them from an information theoretic perspective. To this end,
we will work in the context of the foundational work of Giffin and Caticha
(2007), who showed that Bayesian inference can be framed as a particular case
of the maximum relative entropy principle. We will use this connection to show
that BDeu should not be used for structure learning from sparse data, since it
violates the maximum relative entropy principle; and that it is also
problematic from a more classic Bayesian model selection perspective, because
it produces Bayes factors that are sensitive to the value of its only
hyperparameter. Using a large simulation study, we found in our previous work
(Scutari, 2016) that the Bayesian Dirichlet sparse (BDs) score seems to provide
better accuracy in structure learning; in this paper we further show that BDs
does not suffer from the issues above, and we recommend to use it for sparse
data instead of BDeu. Finally, will show that these issues are in fact
different aspects of the same problem and a consequence of the distributional
assumptions of the prior.Comment: 20 pages, 4 figures; extended version submitted to Behaviormetrik
An Empirical-Bayes Score for Discrete Bayesian Networks
Bayesian network structure learning is often performed in a Bayesian setting,
by evaluating candidate structures using their posterior probabilities for a
given data set. Score-based algorithms then use those posterior probabilities
as an objective function and return the maximum a posteriori network as the
learned model. For discrete Bayesian networks, the canonical choice for a
posterior score is the Bayesian Dirichlet equivalent uniform (BDeu) marginal
likelihood with a uniform (U) graph prior (Heckerman et al., 1995). Its
favourable theoretical properties descend from assuming a uniform prior both on
the space of the network structures and on the space of the parameters of the
network. In this paper, we revisit the limitations of these assumptions; and we
introduce an alternative set of assumptions and the resulting score: the
Bayesian Dirichlet sparse (BDs) empirical Bayes marginal likelihood with a
marginal uniform (MU) graph prior. We evaluate its performance in an extensive
simulation study, showing that MU+BDs is more accurate than U+BDeu both in
learning the structure of the network and in predicting new observations, while
not being computationally more complex to estimate.Comment: 12 pages, PGM 201
Exploiting network topology for large-scale inference of nonlinear reaction models
The development of chemical reaction models aids understanding and prediction
in areas ranging from biology to electrochemistry and combustion. A systematic
approach to building reaction network models uses observational data not only
to estimate unknown parameters, but also to learn model structure. Bayesian
inference provides a natural approach to this data-driven construction of
models. Yet traditional Bayesian model inference methodologies that numerically
evaluate the evidence for each model are often infeasible for nonlinear
reaction network inference, as the number of plausible models can be
combinatorially large. Alternative approaches based on model-space sampling can
enable large-scale network inference, but their realization presents many
challenges. In this paper, we present new computational methods that make
large-scale nonlinear network inference tractable. First, we exploit the
topology of networks describing potential interactions among chemical species
to design improved "between-model" proposals for reversible-jump Markov chain
Monte Carlo. Second, we introduce a sensitivity-based determination of move
types which, when combined with network-aware proposals, yields significant
additional gains in sampling performance. These algorithms are demonstrated on
inference problems drawn from systems biology, with nonlinear differential
equation models of species interactions
Labeled Directed Acyclic Graphs: a generalization of context-specific independence in directed graphical models
We introduce a novel class of labeled directed acyclic graph (LDAG) models
for finite sets of discrete variables. LDAGs generalize earlier proposals for
allowing local structures in the conditional probability distribution of a
node, such that unrestricted label sets determine which edges can be deleted
from the underlying directed acyclic graph (DAG) for a given context. Several
properties of these models are derived, including a generalization of the
concept of Markov equivalence classes. Efficient Bayesian learning of LDAGs is
enabled by introducing an LDAG-based factorization of the Dirichlet prior for
the model parameters, such that the marginal likelihood can be calculated
analytically. In addition, we develop a novel prior distribution for the model
structures that can appropriately penalize a model for its labeling complexity.
A non-reversible Markov chain Monte Carlo algorithm combined with a greedy hill
climbing approach is used for illustrating the useful properties of LDAG models
for both real and synthetic data sets.Comment: 26 pages, 17 figure
Bayesian Fused Lasso regression for dynamic binary networks
We propose a multinomial logistic regression model for link prediction in a
time series of directed binary networks. To account for the dynamic nature of
the data we employ a dynamic model for the model parameters that is strongly
connected with the fused lasso penalty. In addition to promoting sparseness,
this prior allows us to explore the presence of change points in the structure
of the network. We introduce fast computational algorithms for estimation and
prediction using both optimization and Bayesian approaches. The performance of
the model is illustrated using simulated data and data from a financial trading
network in the NYMEX natural gas futures market. Supplementary material
containing the trading network data set and code to implement the algorithms is
available online
- …