Theory Refinement on Bayesian Networks
Theory refinement is the task of updating a domain theory in the light of new
cases, to be done automatically or with some expert assistance. The problem of
theory refinement under uncertainty is reviewed here in the context of Bayesian
statistics, a theory of belief revision. The problem is reduced to an
incremental learning task as follows: the learning system is initially primed
with a partial theory supplied by a domain expert, and thereafter maintains its
own internal representation of alternative theories, which can be interrogated
by the domain expert and incrementally refined from
data. Algorithms for refinement of Bayesian networks are presented to
illustrate what is meant by "partial theory", "alternative theory
representation", etc. The algorithms are an incremental variant of batch
learning algorithms from the literature so can work well in batch and
incremental mode.Comment: Appears in Proceedings of the Seventh Conference on Uncertainty in
Artificial Intelligence (UAI1991
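To make the reduction to incremental learning concrete, here is a minimal Python sketch, not the paper's algorithm: a two-variable domain (an assumption made for brevity) in which sufficient statistics are kept incrementally and two alternative "theories" are rescored after each case. All names are hypothetical.

    # Minimal sketch: maintain counts and rescore two alternative
    # structures over binary variables A, B as cases arrive.
    import math
    from collections import Counter

    counts = Counter()              # sufficient statistics over (a, b) cases

    def observe(case):              # case = (a, b) with a, b in {0, 1}
        counts[case] += 1

    def log_beta_marginal(k, n):
        # log marginal likelihood of k successes in n trials, Beta(1,1) prior
        return math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2)

    def score_independent():        # theory 1: A and B independent
        n = sum(counts.values())
        n_a1 = counts[(1, 0)] + counts[(1, 1)]
        n_b1 = counts[(0, 1)] + counts[(1, 1)]
        return log_beta_marginal(n_a1, n) + log_beta_marginal(n_b1, n)

    def score_edge():               # theory 2: A -> B
        n = sum(counts.values())
        n_a1 = counts[(1, 0)] + counts[(1, 1)]
        total = log_beta_marginal(n_a1, n)
        for a in (0, 1):
            n_a = counts[(a, 0)] + counts[(a, 1)]
            total += log_beta_marginal(counts[(a, 1)], n_a)
        return total

    for case in [(0, 0), (1, 1), (1, 1), (0, 1)]:
        observe(case)               # refinement from data, case by case
    print(score_independent(), score_edge())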
Sensitivity Analysis for Probability Assessments in Bayesian Networks
When eliciting probability models from experts, knowledge engineers may
compare the results of the model with expert judgment on test scenarios, then
adjust model parameters to bring the behavior of the model more in line with
the expert's intuition. This paper presents a methodology for analytic
computation of sensitivity values to measure the impact of small changes in a
network parameter on a target probability value or distribution. These values
can be used to guide knowledge elicitation. They can also be used in a gradient
descent algorithm to estimate parameter values that maximize a measure of
goodness-of-fit to both local and holistic probability assessments. Comment:
Appears in Proceedings of the Ninth Conference on Uncertainty in Artificial
Intelligence (UAI-1993).
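For a flavour of what a sensitivity value is, the sketch below estimates the derivative of a target probability with respect to a single CPT entry by central differences on a toy two-node network; the paper computes such values analytically, and every parameter value here is an assumed placeholder.

    # Toy network X -> Y, binary variables, brute-force enumeration.
    def p_y1(theta):
        # target P(Y=1) as a function of theta = P(Y=1 | X=1);
        # all other parameters are held fixed at assumed values
        p_x1, p_y1_x0 = 0.3, 0.2
        return (1 - p_x1) * p_y1_x0 + p_x1 * theta

    def sensitivity(theta, eps=1e-6):
        # numerical stand-in for the analytic sensitivity value
        return (p_y1(theta + eps) - p_y1(theta - eps)) / (2 * eps)

    print(sensitivity(0.7))   # ~0.3, i.e. P(X=1), since the map is linear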
Fast Learning from Sparse Data
We describe two techniques that significantly improve the running time of
several standard machine-learning algorithms when data is sparse. The first
technique is an algorithm that efficiently extracts one-way and two-way
counts, either real or expected, from discrete data. Extracting such counts is
a fundamental step in learning algorithms for constructing a variety of models
including decision trees, decision graphs, Bayesian networks, and naive-Bayes
clustering models. The second technique is an algorithm that efficiently
performs the E-step of the EM algorithm (i.e., inference) when applied to a
naive-Bayes clustering model. Using real-world data sets, we demonstrate a
dramatic decrease in running time for algorithms that incorporate these
techniques. Comment: Appears in Proceedings of the Fifteenth Conference on
Uncertainty in Artificial Intelligence (UAI-1999).
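The first technique can be sketched for the special case of sparse binary data, where each record stores only the indices of variables equal to 1; the cost then scales with the number of stored entries rather than with the full table. The representation and names below are assumptions, and the paper additionally covers general discrete data and expected counts.

    from collections import Counter
    from itertools import combinations

    def sparse_counts(rows):
        # rows: list of sorted index lists, one per record (1-entries only)
        ones, pair11 = Counter(), Counter()
        for row in rows:
            for i in row:
                ones[i] += 1                    # one-way counts of value 1
            for i, j in combinations(row, 2):
                pair11[(i, j)] += 1             # two-way (1, 1) counts
        # remaining cells follow from marginals, e.g.
        # count(i=1, j=0) = ones[i] - pair11[(i, j)]
        return len(rows), ones, pair11

    print(sparse_counts([[0, 2], [2], [0, 1, 2]]))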
Advances in exact Bayesian structure discovery in Bayesian networks
We consider a Bayesian method for learning the Bayesian network structure
from complete data. Recently, Koivisto and Sood (2004) presented an algorithm
that for any single edge computes its marginal posterior probability in O(n
2^n) time, where n is the number of attributes; the number of parents per
attribute is bounded by a constant. In this paper we show that the posterior
probabilities for all n(n - 1) potential edges can be computed in O(n 2^n)
total time. This result is achieved by a forward-backward technique and fast
Möbius transform algorithms, which are of independent interest. The resulting
speedup by a factor of about n^2 allows us to experimentally study the
statistical power of learning moderate-size networks. We report results from a
simulation study that covers data sets with 20 to 10,000 records over 5 to 25
discrete attributes. Comment: Appears in Proceedings of the Twenty-Second
Conference on Uncertainty in Artificial Intelligence (UAI-2006).
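The fast Möbius (zeta) transform doing the heavy lifting in such O(n 2^n) algorithms is a standard subroutine; a minimal in-place version over bitmask-indexed arrays is shown below. This is the textbook building block only, not the paper's full forward-backward computation.

    def fast_zeta(f, n):
        # g[S] = sum of f[T] over all subsets T of S, in O(n * 2^n)
        g = list(f)                      # f indexed by bitmasks 0 .. 2^n - 1
        for i in range(n):               # absorb one ground element at a time
            for s in range(1 << n):
                if s & (1 << i):
                    g[s] += g[s ^ (1 << i)]
        return g

    print(fast_zeta([1] * 8, 3)[0b101])  # 4 subsets of {0, 2}, so prints 4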
Improved learning of Bayesian networks
The search space of Bayesian network structures is usually defined over
directed acyclic graphs (DAGs), and the search proceeds by local
transformations of DAGs. But the space of Bayesian networks is ordered by DAG
Markov model inclusion, and it is natural that a good search policy should
take this into account. The first attempt to do so (Chickering 1996) used
equivalence classes of DAGs instead of the DAGs themselves. This approach
produces better results but is significantly slower. We present a compromise
between these two approaches. It uses DAGs to search the space in such a way
that the ordering by inclusion is taken into account, achieved by repeated use
of local moves within the equivalence class of a DAG. We show that this new
approach produces better results than the original DAG approach without a
substantial change in time complexity. We present empirical results, within
the frameworks of heuristic search and Markov chain Monte Carlo, on the Alarm
dataset. Comment: Appears in Proceedings of the Seventeenth Conference on
Uncertainty in Artificial Intelligence (UAI-2001).
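The local moves within an equivalence class are covered-edge reversals: an edge u -> v is covered when v's parents are exactly u's parents plus u itself, and reversing such an edge yields a Markov-equivalent DAG. A minimal sketch, with a hypothetical parents-as-sets representation:

    def is_covered(parents, u, v):
        # u -> v is covered iff Pa(v) equals Pa(u) united with {u}
        return parents[v] == parents[u] | {u}

    def reverse_covered_edge(parents, u, v):
        # a covered reversal stays inside the Markov equivalence class
        assert is_covered(parents, u, v)
        parents[v] = parents[v] - {u}
        parents[u] = parents[u] | {v}

    parents = {"a": set(), "b": {"a"}}   # a -> b is covered here
    reverse_covered_edge(parents, "a", "b")
    print(parents)                       # {'a': {'b'}, 'b': set()}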
A Bayesian Network Scoring Metric That Is Based On Globally Uniform Parameter Priors
We introduce a new Bayesian network (BN) scoring metric called the Global
Uniform (GU) metric. This metric is based on a particular type of default
parameter prior. Such priors may be useful when a BN developer is not willing
or able to specify domain-specific parameter priors. The GU parameter prior
specifies that every prior joint probability distribution P consistent with a
BN structure S is considered to be equally likely. Distribution P is consistent
with S if P includes just the set of independence relations defined by S. We
show that the GU metric addresses some undesirable behavior of the BDeu and K2
Bayesian network scoring metrics, which also use particular forms of default
parameter priors. A closed form formula for computing GU for special classes of
BNs is derived. Efficiently computing GU for an arbitrary BN remains an open
problem. Comment: Appears in Proceedings of the Eighteenth Conference on
Uncertainty in Artificial Intelligence (UAI-2002).
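For contrast with the GU idea, the K2 metric mentioned above fixes every Dirichlet hyperparameter to 1 (a uniform prior over each CPT row), which gives its local score the familiar closed form sketched below. This is the K2 formula, not the GU one; the input encoding is an assumption.

    import math

    def k2_local_score(counts, r):
        # counts: parent configuration -> list [N_j1, ..., N_jr] of child
        # counts; r: number of child states. K2 local term per config j:
        #   (r - 1)! / (N_j + r - 1)!  *  prod_k N_jk!
        score = 0.0
        for n_jk in counts.values():
            n_j = sum(n_jk)
            score += math.lgamma(r) - math.lgamma(n_j + r)
            score += sum(math.lgamma(n + 1) for n in n_jk)
        return score                      # returned in log space

    # node with three states and a binary parent (two configurations):
    print(k2_local_score({0: [2, 0, 1], 1: [0, 3, 0]}, r=3))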
Treedy: A Heuristic for Counting and Sampling Subsets
Consider a collection of weighted subsets of a ground set N. Given a query
subset Q of N, how fast can one (1) find the weighted sum over all subsets of
Q, and (2) sample a subset of Q proportionally to the weights? We present a
tree-based greedy heuristic, Treedy, that for a given positive tolerance d
answers such counting and sampling queries to within a guaranteed relative
error d and total variation distance d, respectively. Experimental results on
artificial instances and in application to Bayesian structure discovery in
Bayesian networks show that approximations yield dramatic savings in running
time compared to exact computation, and that Treedy typically outperforms a
previously proposed sorting-based heuristic. Comment: Appears in Proceedings
of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence
(UAI-2013).
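The exact computation that Treedy is compared against fits in a few lines once the collection is stored explicitly: scan the stored subsets contained in Q, sum their weights, and sample in proportion to them. A naive reference version under that assumed representation (Treedy's tree-based pruning is precisely what avoids this full scan):

    import random

    def exact_count(weights, q):
        # weights: dict frozenset -> positive weight; q: query set
        return sum(w for s, w in weights.items() if s <= q)

    def exact_sample(weights, q):
        # sample a stored subset of q proportionally to its weight
        eligible = [(s, w) for s, w in weights.items() if s <= q]
        r = random.uniform(0, sum(w for _, w in eligible))
        for s, w in eligible:
            r -= w
            if r <= 0:
                return s
        return eligible[-1][0]

    w = {frozenset(): 1.0, frozenset({1}): 2.0, frozenset({1, 2}): 4.0}
    print(exact_count(w, {1, 3}))        # 3.0: only {} and {1} lie inside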
Learning the Bayesian Network Structure: Dirichlet Prior versus Data
In the Bayesian approach to structure learning of graphical models, the
equivalent sample size (ESS) in the Dirichlet prior over the model parameters
was recently shown to have an important effect on the maximum-a-posteriori
estimate of the Bayesian network structure. In our first contribution, we
theoretically analyze the case of large ESS-values, which complements previous
work: among other results, we find that the presence of an edge in a Bayesian
network is favoured over its absence even if both the Dirichlet prior and the
data imply independence, as long as the conditional empirical distribution is
notably different from uniform. In our second contribution, we focus on
realistic ESS-values, and provide an analytical approximation to the "optimal"
ESS-value in a predictive sense (its accuracy is also validated
experimentally): this approximation provides an understanding of which
properties of the data have the main effect in determining the "optimal"
ESS-value. Comment: Appears in Proceedings of the Twenty-Fourth Conference on
Uncertainty in Artificial Intelligence (UAI-2008).
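The BDeu score at the centre of this analysis distributes the ESS uniformly over the cells of each CPT, so its ESS-dependence is easy to probe directly. Below is a minimal sketch of the BDeu local score (input encoding assumed), evaluated over a range of ESS values to show how the prior strength shifts the edge-versus-no-edge comparison the paper analyzes:

    import math

    def bdeu_local(counts, r, ess):
        # counts: list over parent configurations of per-state count lists;
        # r: number of child states; BDeu cell hyperparameter: ess / (q * r)
        q = len(counts)
        a_j, a_jk = ess / q, ess / (q * r)
        score = 0.0
        for n_jk in counts:
            score += math.lgamma(a_j) - math.lgamma(sum(n_jk) + a_j)
            score += sum(math.lgamma(n + a_jk) - math.lgamma(a_jk)
                         for n in n_jk)
        return score

    # Y skewed 90/10 and empirically independent of binary X:
    with_edge, no_edge = [[90, 10], [90, 10]], [[180, 20]]
    for ess in (1, 10, 100, 1000):
        print(ess, bdeu_local(with_edge, 2, ess) - bdeu_local(no_edge, 2, ess))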
Exact Maximum Margin Structure Learning of Bayesian Networks
Recently, there has been much interest in finding globally optimal Bayesian
network structures. These techniques were developed for generative scores and
cannot be directly extended to discriminative scores, as desired for
classification. In this paper, we propose an exact method for finding network
structures maximizing the probabilistic soft margin, a successfully applied
discriminative score. Our method is based on branch-and-bound techniques within
a linear programming framework and maintains an anytime solution, together
with worst-case sub-optimality bounds. We apply a set of order constraints to
enforce acyclicity of the network structure, which allows a compact problem
representation and the use of general-purpose optimization techniques. In
classification experiments, our methods clearly outperform generatively trained
network structures and compete with support vector machines. Comment: ICM
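The point of the order constraints is that acyclicity comes for free: once a total order on the variables is fixed, any parent sets chosen from a node's predecessors form a DAG. The brute-force sketch below illustrates that principle with exhaustive enumeration and a generic score; the paper instead explores the space with branch-and-bound inside a linear program.

    from itertools import combinations, permutations

    def best_network(variables, score, max_parents=2):
        # score(v, parent_set) -> float is assumed given, e.g. a
        # (hypothetical) per-node soft-margin score
        best_total, best_parents = float("-inf"), None
        for order in permutations(variables):    # a DAG for every order
            total, parents = 0.0, {}
            for i, v in enumerate(order):
                cands = [frozenset(c)
                         for k in range(min(i, max_parents) + 1)
                         for c in combinations(order[:i], k)]
                parents[v] = max(cands, key=lambda c: score(v, c))
                total += score(v, parents[v])
            if total > best_total:
                best_total, best_parents = total, parents
        return best_total, best_parents

    # toy score favouring small parent sets, as a stand-in:
    print(best_network(["x1", "x2", "x3"], lambda v, ps: -len(ps)))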
Learning Bayesian Networks with Restricted Causal Interactions
A major problem for the learning of Bayesian networks (BNs) is the
exponential number of parameters needed for conditional probability tables.
Recent research reduces this complexity by modeling local structure in the
probability tables. We examine the use of log-linear local models. While
log-linear models in this context are not new (Whittaker, 1990; Buntine, 1991;
Neal, 1992; Heckerman and Meek, 1997), for structure learning they are
generally subsumed under a naive Bayes model. We describe an alternative
interpretation, and use a Minimum Message Length (MML) (Wallace, 1987) metric
for structure learning of networks exhibiting causal independence, which we
term first-order networks (FONs). We also investigate local model selection on
a node-by-node basis. Comment: Appears in Proceedings of the Fifteenth
Conference on Uncertainty in Artificial Intelligence (UAI-1999).
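Causal independence, which FONs exploit, is most familiar from the noisy-OR model: a CPT over n binary causes collapses to n activation probabilities plus an optional leak. A minimal sketch of that classic model, not of the paper's log-linear formulation:

    def noisy_or(active, p, leak=0.0):
        # P(effect = 1 | causes): each active cause i independently fails
        # to trigger the effect with probability 1 - p[i]; the leak term
        # covers the all-causes-inactive case
        q = 1.0 - leak
        for i in active:
            q *= 1.0 - p[i]
        return 1.0 - q

    # CPT entry for causes {0, 2} active, given per-cause strengths p:
    print(noisy_or({0, 2}, p=[0.8, 0.5, 0.6], leak=0.05))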