The Bayesian Structural EM Algorithm
In recent years there has been a flurry of works on learning Bayesian
networks from data. One of the hard problems in this area is how to effectively
learn the structure of a belief network from incomplete data, that is, in the
presence of missing values or hidden variables. In a recent paper, I introduced
an algorithm called Structural EM that combines the standard Expectation
Maximization (EM) algorithm, which optimizes parameters, with structure search
for model selection. That algorithm learns networks based on penalized
likelihood scores, which include the BIC/MDL score and various approximations
to the Bayesian score. In this paper, I extend Structural EM to deal directly
with Bayesian model selection. I prove the convergence of the resulting
algorithm and show how to apply it for learning a large class of probabilistic
models, including Bayesian networks and some variants thereof.
Comment: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998).
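The alternating loop that Structural EM describes can be sketched on a toy two-variable problem. Everything here (the two candidate structures, the BIC scoring, the helper logic) is an illustrative reduction, not the paper's algorithm:

```python
import math

def structural_em(data, iters=30):
    # Toy Structural EM over two binary variables (x, y); y may be None
    # (missing). Candidate structures: "indep" (no edge) vs "dep" (an
    # edge x -> y, i.e. the full joint). Illustrative sketch only.
    n = len(data)
    joint = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
    structure = "indep"
    for _ in range(iters):
        # E-step: expected counts, distributing each missing y according
        # to the current joint distribution
        ess = {cell: 0.0 for cell in joint}
        for x, y in data:
            if y is None:
                pz = joint[(x, 0)] + joint[(x, 1)]
                for yy in (0, 1):
                    ess[(x, yy)] += joint[(x, yy)] / pz
            else:
                ess[(x, y)] += 1.0
        # Structural M-step: BIC-score both candidate structures against
        # the expected sufficient statistics and keep the winner
        nx1 = ess[(1, 0)] + ess[(1, 1)]
        ny1 = ess[(0, 1)] + ess[(1, 1)]
        ll_indep = sum(
            c * math.log(((nx1 if x else n - nx1) / n)
                         * ((ny1 if y else n - ny1) / n) + 1e-12)
            for (x, y), c in ess.items())
        ll_dep = sum(c * math.log(c / n + 1e-12) for c in ess.values())
        pen = 0.5 * math.log(n)
        structure = "dep" if ll_dep - 3 * pen > ll_indep - 2 * pen else "indep"
        # Parametric M-step: refit the joint consistently with the winner
        if structure == "dep":
            joint = {cell: c / n for cell, c in ess.items()}
        else:
            px1, py1 = nx1 / n, ny1 / n
            joint = {(x, y): (px1 if x else 1 - px1) * (py1 if y else 1 - py1)
                     for x in (0, 1) for y in (0, 1)}
    return structure
```

The key point the sketch captures is that the structure search is run against *expected* sufficient statistics computed under the current model, rather than requiring complete data.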
Learning Bayesian Networks from Ordinal Data
Bayesian networks are a powerful framework for studying the dependency structure of variables in a complex system. The problem of learning Bayesian networks is tightly associated with the given data type. Ordinal data, such as stages of cancer, rating scale survey questions, and letter grades for exams, are ubiquitous in applied research. However, existing solutions are mainly for continuous and nominal data. In this work, we propose an iterative score-and-search method - called the Ordinal Structural EM (OSEM) algorithm - for learning Bayesian networks from ordinal data. Unlike traditional approaches designed for nominal data, we explicitly respect the ordering amongst the categories. More precisely, we assume that the ordinal variables originate from marginally discretizing a set of Gaussian variables, whose structural dependence in the latent space follows a directed acyclic graph. Then, we adopt the Structural EM algorithm and derive closed-form scoring functions for efficient graph searching. Through simulation studies, we illustrate the superior performance of the OSEM algorithm compared to the alternatives and analyze various factors that may influence the learning accuracy. Finally, we demonstrate the practicality of our method with a real-world application on psychological survey data from 408 patients with co-morbid symptoms of obsessive-compulsive disorder and depression.
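The latent-variable view used by OSEM, in which each ordinal variable arises by discretizing a Gaussian, can be illustrated by recovering the cut points implied by observed category proportions. This is only the marginal-discretization idea, not the OSEM implementation:

```python
from statistics import NormalDist

def ordinal_thresholds(counts):
    # Thresholds that carve a standard Gaussian into ordinal categories
    # matching the observed category counts: category k corresponds to
    # the latent variable falling between consecutive thresholds.
    n = sum(counts)
    nd = NormalDist()
    cum, thresholds = 0, []
    for c in counts[:-1]:
        cum += c
        thresholds.append(nd.inv_cdf(cum / n))
    return thresholds
```

For example, a three-level ordinal variable observed with proportions 25%/50%/25% maps to symmetric cut points near ±0.674 on the latent Gaussian scale.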
Learning the Dimensionality of Hidden Variables
A serious problem in learning probabilistic models is the presence of hidden
variables. These variables are not observed, yet interact with several of the
observed variables. Detecting hidden variables poses two problems: determining
the relations to other variables in the model and determining the number of
states of the hidden variable. In this paper, we address the latter problem in
the context of Bayesian networks. We describe an approach that utilizes a
score-based agglomerative state-clustering. As we show, this approach allows us
to efficiently evaluate models with a range of cardinalities for the hidden
variable. We show how to extend this procedure to deal with multiple
interacting hidden variables. We demonstrate the effectiveness of this approach
by evaluating it on synthetic and real-life data. We show that our approach
learns models with hidden variables that generalize better and have better
structure than previous approaches.
Comment: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI 2001).
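The score-based agglomerative state-clustering described in this abstract can be sketched as a greedy merge loop. The toy below scores states only by a binary observed child with a BIC-style penalty; the setup and scoring are illustrative, not the paper's procedure:

```python
import math

def loglik(n1, n0):
    # Maximum log-likelihood of counts (n1, n0) under a Bernoulli model.
    n = n1 + n0
    ll = 0.0
    if n1:
        ll += n1 * math.log(n1 / n)
    if n0:
        ll += n0 * math.log(n0 / n)
    return ll

def agglomerate(states):
    # Greedy state merging for a hidden variable: each state is (n1, n0),
    # the counts of a binary observed child given that hidden state.
    # Merge the best pair while the BIC score improves, i.e. while the
    # likelihood lost is outweighed by the parameter saved.
    N = sum(a + b for a, b in states)
    pen = 0.5 * math.log(N)  # BIC cost of one free parameter
    states = list(states)
    while len(states) > 1:
        best = None
        for i in range(len(states)):
            for j in range(i + 1, len(states)):
                a, b = states[i], states[j]
                merged = (a[0] + b[0], a[1] + b[1])
                delta = loglik(*merged) - loglik(*a) - loglik(*b) + pen
                if best is None or delta > best[0]:
                    best = (delta, i, j, merged)
        if best[0] <= 0:
            break
        _, i, j, merged = best
        states = [s for k, s in enumerate(states) if k not in (i, j)] + [merged]
    return states
```

Because each merge is scored incrementally, a whole range of cardinalities for the hidden variable is evaluated in one agglomerative pass, which is the efficiency argument the abstract makes.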
An integrated hydrologic Bayesian multimodel combination framework: Confronting input, parameter, and model structural uncertainty in hydrologic prediction
The conventional treatment of uncertainty in rainfall-runoff modeling primarily attributes uncertainty in the input-output representation of the model to uncertainty in the model parameters without explicitly addressing the input, output, and model structural uncertainties. This paper presents a new framework, the Integrated Bayesian Uncertainty Estimator (IBUNE), to account for the major uncertainties of hydrologic rainfall-runoff predictions explicitly. IBUNE distinguishes between the various sources of uncertainty including parameter, input, and model structural uncertainty. An input error model in the form of a Gaussian multiplier has been introduced within IBUNE. These multipliers are assumed to be drawn from an identical distribution with an unknown mean and variance, which are estimated along with other hydrological model parameters by a Markov chain Monte Carlo (MCMC) scheme. IBUNE also includes the Bayesian model averaging (BMA) scheme, which is employed to further improve the prediction skill and address model structural uncertainty using multiple model outputs. A series of case studies using three rainfall-runoff models to predict the streamflow in the Leaf River basin, Mississippi, are used to examine the necessity and usefulness of this technique. The results suggest that ignoring either input forcing error or model structural uncertainty will lead to unrealistic model simulations and incorrect uncertainty bounds. Copyright 2007 by the American Geophysical Union.
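The two IBUNE ingredients named above, Gaussian input multipliers and BMA combination of several model structures, can be sketched as follows. The linear-reservoir "model", the parameter names, and the fixed weights are all stand-ins; the real framework estimates the multiplier mean/variance and the weights by MCMC:

```python
import random

def simulate(rain, k):
    # Toy linear-reservoir runoff model: storage drains at rate k each
    # step. A stand-in for the rainfall-runoff models in the study.
    s, flow = 0.0, []
    for r in rain:
        s += r
        q = k * s
        s -= q
        flow.append(q)
    return flow

def ibune_like_prediction(rain, models, mult_mean, mult_sd, bma_weights,
                          seed=0):
    # 1) Input uncertainty: perturb each rainfall value with a Gaussian
    #    multiplier drawn from N(mult_mean, mult_sd).
    # 2) Structural uncertainty: run structurally different models
    #    (here, different drainage rates) and combine by BMA weights.
    rng = random.Random(seed)
    perturbed = [r * rng.gauss(mult_mean, mult_sd) for r in rain]
    outputs = [simulate(perturbed, k) for k in models]
    return [sum(w * out[t] for w, out in zip(bma_weights, outputs))
            for t in range(len(rain))]
```

With the multiplier variance set to zero and all weight on one model, the combination reduces to a single deterministic simulation, which makes the role of each added uncertainty source easy to see.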
Discovering the Hidden Structure of Complex Dynamic Systems
Dynamic Bayesian networks provide a compact and natural representation for
complex dynamic systems. However, in many cases, there is no expert available
from whom a model can be elicited. Learning provides an alternative approach
for constructing models of dynamic systems. In this paper, we address some of
the crucial computational aspects of learning the structure of dynamic systems,
particularly those where some relevant variables are partially observed or even
entirely unknown. Our approach is based on the Structural Expectation
Maximization (SEM) algorithm. The main computational cost of the SEM algorithm
is the gathering of expected sufficient statistics. We propose a novel
approximation scheme that allows these sufficient statistics to be computed
efficiently. We also investigate the fundamental problem of discovering the
existence of hidden variables without exhaustive and expensive search. Our
approach is based on the observation that, in dynamic systems, ignoring a
hidden variable typically results in a violation of the Markov property. Thus,
our algorithm searches for such violations in the data, and introduces hidden
variables to explain them. We provide empirical results showing that the
algorithm is able to learn the dynamics of complex systems in a computationally
tractable way.
Comment: Appears in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI 1999).
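The Markov-property violation that this abstract uses to detect hidden variables can be quantified with the conditional mutual information I(X_{t+1}; X_{t-1} | X_t): for a first-order Markov process it is near zero, while a hidden state driving the dynamics typically pushes it away from zero. This estimator is an illustrative diagnostic, not the paper's exact test:

```python
import math
from collections import Counter

def markov_violation_score(seq):
    # Plug-in estimate of I(X_{t+1}; X_{t-1} | X_t) from a discrete
    # sequence, using empirical triple, pair, and singleton frequencies.
    triples = Counter(zip(seq, seq[1:], seq[2:]))
    n = sum(triples.values())
    pairs01 = Counter((a, b) for a, b, _ in triples.elements())
    pairs12 = Counter((b, c) for _, b, c in triples.elements())
    mids = Counter(b for _, b, _ in triples.elements())
    cmi = 0.0
    for (a, b, c), cnt in triples.items():
        p_abc = cnt / n
        cmi += p_abc * math.log(
            p_abc * (mids[b] / n)
            / ((pairs01[(a, b)] / n) * (pairs12[(b, c)] / n)))
    return cmi
```

A purely alternating sequence is first-order Markov and scores (numerically) zero, whereas a period-4 pattern such as 0,0,1,1,... needs a hidden phase variable to look Markov, and the score is clearly positive.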
A Robust Independence Test for Constraint-Based Learning of Causal Structure
Constraint-based (CB) learning is a formalism for learning a causal network
from a database D by performing a series of conditional-independence tests to
infer structural information. This paper considers a new test of independence
that combines ideas from Bayesian learning, Bayesian network inference, and
classical hypothesis testing to produce a more reliable and robust test. The
new test can be calculated in the same asymptotic time and space required for
the standard tests such as the chi-squared test, but it allows the
specification of a prior distribution over parameters and can be used when the
database is incomplete. We prove that the test is correct, and we demonstrate
empirically that, when used with a CB causal discovery algorithm with
noninformative priors, it recovers structural features more reliably and it
produces networks with smaller KL-Divergence, especially as the number of nodes
increases or the number of records decreases. Another benefit is the dramatic
reduction in the probability that a CB algorithm will stall during the search,
providing a remedy for an annoying problem plaguing CB learning when the
database is small.
Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI 2003).
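One way to combine Bayesian learning with a hypothesis-test flavour, as this abstract describes, is a Bayes factor for independence built from Dirichlet-multinomial marginal likelihoods on a contingency table. The statistic below is a generic illustration of that idea with symmetric Dirichlet(1) priors, not the paper's exact test:

```python
import math

def log_dirichlet_ml(counts, alpha=1.0):
    # Log marginal likelihood of multinomial counts under a symmetric
    # Dirichlet(alpha) prior (Dirichlet-multinomial).
    a0, n = alpha * len(counts), sum(counts)
    return (math.lgamma(a0) - math.lgamma(a0 + n)
            + sum(math.lgamma(alpha + c) - math.lgamma(alpha)
                  for c in counts))

def bayesian_independence_logbf(table):
    # Log Bayes factor for dependence vs independence on a 2-D
    # contingency table. Under independence the marginal likelihood
    # factorizes into the row-margin and column-margin terms.
    flat = [c for row in table for c in row]
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    return log_dirichlet_ml(flat) - (log_dirichlet_ml(rows)
                                     + log_dirichlet_ml(cols))
```

Like the chi-squared test, this needs only one pass over the table cells, which matches the abstract's claim that the asymptotic time and space costs are the same, while the Dirichlet hyperparameters play the role of the prior over parameters.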
Learning the Structure of Dynamic Probabilistic Networks
Dynamic probabilistic networks are a compact representation of complex
stochastic processes. In this paper we examine how to learn the structure of a
DPN from data. We extend structure scoring rules for standard probabilistic
networks to the dynamic case, and show how to search for structure when some of
the variables are hidden. Finally, we examine two applications where such a
technology might be useful: predicting and classifying dynamic behaviors, and
learning causal orderings in biological processes. We provide empirical results
that demonstrate the applicability of our methods in both domains.
Comment: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998).
Dynamic Sparse Factor Analysis
Its conceptual appeal and effectiveness have made latent factor modeling an
indispensable tool for multivariate analysis. Despite its popularity across
many fields, there are outstanding methodological challenges that have hampered
practical deployments. One major challenge is the selection of the number of
factors, which is exacerbated for dynamic factor models, where factors can
disappear, emerge, and/or reoccur over time. Existing tools that assume a fixed
number of factors may provide a misguided representation of the data mechanism,
especially when the number of factors is crudely misspecified. Another
challenge is the interpretability of the factor structure, which is often
regarded as an unattainable objective due to the lack of identifiability.
Motivated by a topical macroeconomic application, we develop a flexible
Bayesian method for dynamic factor analysis (DFA) that can simultaneously
accommodate a time-varying number of factors and enhance interpretability
without strict identifiability constraints. To this end, we turn to dynamic
sparsity by employing Dynamic Spike-and-Slab (DSS) priors within DFA. Scalable
Bayesian EM estimation is proposed for fast posterior mode identification via
rotations to sparsity, enabling Bayesian data analysis at scales that would
have been previously time-consuming. We study a large-scale balanced panel of
macroeconomic variables covering multiple facets of the US economy, with a
focus on the Great Recession, to highlight the efficacy and usefulness of our
proposed method.
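The dynamic sparsity mechanism named above rests on a spike-and-slab mixture: a loading is either near zero (spike) or active (slab), and its posterior inclusion probability follows from the two component densities. The paper's DSS prior uses different component families and lets the mixing weight evolve over time; the Gaussian-mixture version below, with made-up default scales, only illustrates the thresholding behaviour:

```python
from statistics import NormalDist

def dss_inclusion_prob(beta, theta=0.5, spike_sd=0.05, slab_sd=2.0):
    # Posterior probability that a factor loading beta comes from the
    # "slab" (active) component rather than the near-zero "spike",
    # given prior inclusion probability theta. All scales illustrative.
    slab = theta * NormalDist(0, slab_sd).pdf(beta)
    spike = (1 - theta) * NormalDist(0, spike_sd).pdf(beta)
    return slab / (slab + spike)
```

Large loadings are assigned to the slab with probability near one, while loadings near zero are shrunk into the spike, which is how a factor can effectively disappear or emerge as its loadings drift over time.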
Learning Bayesian Networks from Incomplete Data with the Node-Average Likelihood
Bayesian network (BN) structure learning from complete data has been
extensively studied in the literature. However, fewer theoretical results are
available for incomplete data, and most are related to the
Expectation-Maximisation (EM) algorithm. Balov (2013) proposed an alternative
approach called Node-Average Likelihood (NAL) that is competitive with EM but
computationally more efficient; and he proved its consistency and model
identifiability for discrete BNs.
In this paper, we give general sufficient conditions for the consistency of
NAL; and we prove consistency and identifiability for conditional Gaussian BNs,
which include discrete and Gaussian BNs as special cases. Furthermore, we
confirm our results and the results in Balov (2013) with an independent
simulation study. Hence we show that NAL has a much wider applicability than
originally implied in Balov (2013), and that it is competitive with EM for
conditional Gaussian BNs as well.
Comment: 27 pages, 5 figures.
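The Node-Average Likelihood can be sketched for one discrete node: the log-likelihood of the node given its parents is averaged over only those records in which the node and all its parents are observed, which is what makes it cheaper than EM's global imputation. This is a simplified sketch of Balov's estimator, with an illustrative record/CPT encoding:

```python
import math

def node_average_loglik(records, node, parents, cpt):
    # NAL for one node: average log P(node | parents) over the records
    # that are locally complete for this family (None marks a missing
    # value). cpt maps a tuple of parent values to a dict of
    # P(node value | parents).
    total, used = 0.0, 0
    for rec in records:
        vals = tuple(rec[p] for p in parents)
        if rec[node] is None or None in vals:
            continue
        total += math.log(cpt[vals][rec[node]])
        used += 1
    return total / used if used else 0.0
```

Records missing only variables outside the family still contribute fully to this node's score, so no expectation step over the missing values is needed.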
Multiple models of Bayesian networks applied to offline recognition of Arabic handwritten city names
In this paper we address the problem of offline Arabic handwritten word
recognition. Offline recognition of handwritten words is a difficult task due
to the high variability and uncertainty of human writing. The majority of the
recent systems are constrained by the size of the lexicon to deal with and the
number of writers. In this paper, we propose an approach for multi-writer
Arabic handwritten word recognition using multiple Bayesian networks. First,
we cut the image into several blocks. For each block, we compute a vector of
descriptors. Then, we use K-means to cluster the low-level features, including
Zernike and Hu moments. Finally, we apply four variants of Bayesian network
classifiers (Naïve Bayes, Tree Augmented Naïve Bayes (TAN), Forest Augmented
Naïve Bayes (FAN), and dynamic Bayesian network (DBN)) to classify the whole
image of a Tunisian city name. The results demonstrate that FAN and DBN
achieve good recognition rates.
Comment: arXiv admin note: substantial text overlap with arXiv:1204.167
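The first step of the pipeline, cutting the word image into blocks and computing one descriptor vector per block, can be sketched as follows. The mean-intensity descriptor is a deliberately minimal stand-in for the Zernike and Hu moments the paper actually uses:

```python
def block_descriptors(image, rows, cols):
    # Cut a 2-D grayscale image (list of lists of intensities) into a
    # rows x cols grid of blocks and return one mean-intensity
    # descriptor per block, scanned row-major.
    h, w = len(image), len(image[0])
    bh, bw = h // rows, w // cols
    feats = []
    for i in range(rows):
        for j in range(cols):
            block = [image[r][c]
                     for r in range(i * bh, (i + 1) * bh)
                     for c in range(j * bw, (j + 1) * bw)]
            feats.append(sum(block) / len(block))
    return feats
```

In the full system these per-block descriptors would then be quantized by K-means and fed to the Bayesian network classifiers as discrete observations.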