    The Bayesian Structural EM Algorithm

    In recent years there has been a flurry of work on learning Bayesian networks from data. One of the hard problems in this area is how to effectively learn the structure of a belief network from incomplete data, that is, in the presence of missing values or hidden variables. In a recent paper, I introduced an algorithm called Structural EM that combines the standard Expectation Maximization (EM) algorithm, which optimizes parameters, with structure search for model selection. That algorithm learns networks based on penalized likelihood scores, which include the BIC/MDL score and various approximations to the Bayesian score. In this paper, I extend Structural EM to deal directly with Bayesian model selection. I prove the convergence of the resulting algorithm and show how to apply it to learning a large class of probabilistic models, including Bayesian networks and some variants thereof.
    Comment: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998).
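
    The score-and-search loop this abstract refers to is easiest to see in code. Below is a minimal sketch of the BIC/MDL family score that the structure search maximizes; in Structural EM proper, the counts would be expected sufficient statistics computed by inference in the current network rather than raw counts. All function and variable names are ours, not the paper's.

        import numpy as np

        def bic_score(data, parents, cards):
            """BIC/MDL score of a discrete Bayesian network structure.
            data: (n, d) int array; parents: {node: list of parent nodes};
            cards: list of variable cardinalities. Illustrative only."""
            n = data.shape[0]
            score = 0.0
            for x, pa in parents.items():
                q = int(np.prod([cards[p] for p in pa])) if pa else 1
                counts = np.zeros((q, cards[x]))
                idx = np.zeros(n, dtype=int)         # mixed-radix parent-config index
                for p in pa:
                    idx = idx * cards[p] + data[:, p]
                np.add.at(counts, (idx, data[:, x]), 1)
                rows = counts.sum(axis=1, keepdims=True)
                with np.errstate(divide="ignore", invalid="ignore"):
                    ll = np.where(counts > 0, counts * np.log(counts / rows), 0.0).sum()
                score += ll - 0.5 * np.log(n) * q * (cards[x] - 1)  # BIC penalty
            return score

        rng = np.random.default_rng(0)
        a = rng.integers(0, 2, 500)
        b = (a ^ (rng.random(500) < 0.1)).astype(int)   # b is a noisy copy of a
        data = np.column_stack([a, b])
        print(bic_score(data, {0: [], 1: [0]}, [2, 2])) # true structure: a -> b
        print(bic_score(data, {0: [], 1: []}, [2, 2]))  # empty graph scores worse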

    Learning Bayesian Networks from Ordinal Data

    Bayesian networks are a powerful framework for studying the dependency structure of variables in a complex system. The problem of learning Bayesian networks is tightly coupled to the type of data at hand. Ordinal data, such as stages of cancer, rating-scale survey questions, and letter grades for exams, are ubiquitous in applied research. However, existing solutions are mainly for continuous and nominal data. In this work, we propose an iterative score-and-search method, called the Ordinal Structural EM (OSEM) algorithm, for learning Bayesian networks from ordinal data. Unlike traditional approaches designed for nominal data, we explicitly respect the ordering amongst the categories. More precisely, we assume that the ordinal variables originate from marginally discretizing a set of Gaussian variables, whose structural dependence in the latent space follows a directed acyclic graph. Then, we adopt the Structural EM algorithm and derive closed-form scoring functions for efficient graph searching. Through simulation studies, we illustrate the superior performance of the OSEM algorithm compared to the alternatives and analyze various factors that may influence the learning accuracy. Finally, we demonstrate the practicality of our method with a real-world application on psychological survey data from 408 patients with co-morbid symptoms of obsessive-compulsive disorder and depression.
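
    To make the latent-Gaussian assumption concrete, here is a small illustration of the OSEM data-generating view: latent Gaussians whose dependence follows a DAG are marginally discretized into ordinal levels. The toy DAG and cut points below are invented for illustration.

        import numpy as np

        rng = np.random.default_rng(1)
        n = 1000
        # Latent DAG: Z1 -> Z2 -> Z3 (linear-Gaussian structural equations)
        z1 = rng.normal(size=n)
        z2 = 0.8 * z1 + rng.normal(scale=0.6, size=n)
        z3 = -0.7 * z2 + rng.normal(scale=0.7, size=n)

        def discretize(z, thresholds):
            """Map a latent Gaussian to ordinal levels 0..len(thresholds)."""
            return np.digitize(z, thresholds)

        # Each observed ordinal variable has its own cut points (marginal discretization)
        x1 = discretize(z1, [-0.5, 0.5])        # 3 levels
        x2 = discretize(z2, [-1.0, 0.0, 1.0])   # 4 levels
        x3 = discretize(z3, [0.0])              # 2 levels
        print(np.corrcoef(np.vstack([x1, x2, x3])))  # ordinal data inherit the latent dependence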

    Learning the Dimensionality of Hidden Variables

    A serious problem in learning probabilistic models is the presence of hidden variables. These variables are not observed, yet interact with several of the observed variables. Detecting hidden variables poses two problems: determining the relations to other variables in the model, and determining the number of states of the hidden variable. In this paper, we address the latter problem in the context of Bayesian networks. We describe an approach that utilizes score-based agglomerative state-clustering. As we show, this approach allows us to efficiently evaluate models with a range of cardinalities for the hidden variable. We show how to extend this procedure to deal with multiple interacting hidden variables. We demonstrate the effectiveness of this approach by evaluating it on synthetic and real-life data, and show that it learns models with hidden variables that generalize better and have better structure than those of previous approaches.
    Comment: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI 2001).
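
    The agglomerative idea can be sketched as follows: start from the maximal number of states and greedily merge the pair of states whose merger yields the best score, recording the score at each cardinality along the way. This toy version scores P(Y | state) for an observed proxy variable Y by BIC; the paper clusters states of a hidden variable using network scores, so this is our simplification, not its procedure.

        import numpy as np

        def bic(counts, n):
            """BIC of P(Y | cluster) from a (clusters x |Y|) count table."""
            rows = counts.sum(axis=1, keepdims=True)
            with np.errstate(divide="ignore", invalid="ignore"):
                ll = np.where(counts > 0, counts * np.log(counts / rows), 0.0).sum()
            return ll - 0.5 * np.log(n) * counts.shape[0] * (counts.shape[1] - 1)

        rng = np.random.default_rng(2)
        h = rng.integers(0, 6, 2000)                    # 6 raw states, but only 2 matter
        y = ((h < 3) ^ (rng.random(2000) < 0.1)).astype(int)
        counts = np.zeros((6, 2))
        np.add.at(counts, (h, y), 1)
        n = len(h)
        while counts.shape[0] > 1:
            best = None
            for i in range(counts.shape[0]):            # try every pair merge
                for j in range(i + 1, counts.shape[0]):
                    merged = np.delete(counts, j, axis=0)
                    merged[i] += counts[j]
                    s = bic(merged, n)
                    if best is None or s > best[0]:
                        best = (s, merged)
            print(f"cardinality {best[1].shape[0]}: BIC {best[0]:.1f}")  # peaks at 2
            counts = best[1]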

    Discovering the Hidden Structure of Complex Dynamic Systems

    Dynamic Bayesian networks provide a compact and natural representation for complex dynamic systems. However, in many cases, there is no expert available from whom a model can be elicited. Learning provides an alternative approach for constructing models of dynamic systems. In this paper, we address some of the crucial computational aspects of learning the structure of dynamic systems, particularly those where some relevant variables are partially observed or even entirely unknown. Our approach is based on the Structural Expectation Maximization (SEM) algorithm. The main computational cost of the SEM algorithm is the gathering of expected sufficient statistics. We propose a novel approximation scheme that allows these sufficient statistics to be computed efficiently. We also investigate the fundamental problem of discovering the existence of hidden variables without exhaustive and expensive search. Our approach is based on the observation that, in dynamic systems, ignoring a hidden variable typically results in a violation of the Markov property. Thus, our algorithm searches for such violations in the data and introduces hidden variables to explain them. We provide empirical results showing that the algorithm is able to learn the dynamics of complex systems in a computationally tractable way.
    Comment: Appears in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI 1999).
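
    The hidden-variable detection idea lends itself to a short demonstration: if the observed process were Markov, X[t-1] would carry no information about X[t+1] given X[t]. The partial-correlation check below is our illustrative stand-in for the paper's search for such violations.

        import numpy as np

        rng = np.random.default_rng(3)
        T = 5000
        h = np.zeros(T); x = np.zeros(T)
        for t in range(1, T):
            h[t] = 0.95 * h[t - 1] + rng.normal(scale=0.3)  # slow hidden state
            x[t] = h[t] + rng.normal(scale=0.5)             # noisy observation

        past, present, future = x[:-2], x[1:-1], x[2:]

        def residual(a, b):
            """OLS residual of a after regressing out b."""
            slope = np.dot(a - a.mean(), b - b.mean()) / np.dot(b - b.mean(), b - b.mean())
            return a - a.mean() - slope * (b - b.mean())

        # partial correlation = correlation of the residuals; clearly nonzero here,
        # flagging a Markov violation caused by the ignored hidden variable h
        pc = np.corrcoef(residual(future, present), residual(past, present))[0, 1]
        print(f"partial corr of X[t+1] and X[t-1] given X[t]: {pc:.3f}")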

    A Robust Independence Test for Constraint-Based Learning of Causal Structure

    Constraint-based (CB) learning is a formalism for learning a causal network from a database D by performing a series of conditional-independence tests to infer structural information. This paper considers a new test of independence that combines ideas from Bayesian learning, Bayesian network inference, and classical hypothesis testing to produce a more reliable and robust test. The new test can be calculated in the same asymptotic time and space required for standard tests such as the chi-squared test, but it allows the specification of a prior distribution over parameters and can be used when the database is incomplete. We prove that the test is correct, and we demonstrate empirically that, when used with a CB causal discovery algorithm with noninformative priors, it recovers structural features more reliably and produces networks with smaller KL divergence, especially as the number of nodes increases or the number of records decreases. Another benefit is a dramatic reduction in the probability that a CB algorithm will stall during the search, providing a remedy for an annoying problem that plagues CB learning when the database is small.
    Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI 2003).
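
    For intuition, here is one classical Bayesian independence test in the same spirit: compare Dirichlet-multinomial marginal likelihoods of a dependent and an independent model of two discrete variables. The paper's test is more elaborate (conditional tests, incomplete data, integration with inference); this sketch, with a uniform Dirichlet prior, only conveys the Bayes-factor idea.

        import numpy as np
        from scipy.special import gammaln

        def log_marginal(counts, alpha=1.0):
            """log Dirichlet-multinomial marginal likelihood of a count vector."""
            counts = np.asarray(counts, dtype=float).ravel()
            k = counts.size
            return (gammaln(k * alpha) - gammaln(k * alpha + counts.sum())
                    + np.sum(gammaln(alpha + counts) - gammaln(alpha)))

        rng = np.random.default_rng(4)
        x = rng.integers(0, 2, 300)
        y = (x ^ (rng.random(300) < 0.3)).astype(int)   # weakly dependent on x
        table = np.zeros((2, 2))
        np.add.at(table, (x, y), 1)

        log_bf = (log_marginal(table)                   # dependent model: joint cells
                  - log_marginal(table.sum(axis=1))     # independent model: X margin...
                  - log_marginal(table.sum(axis=0)))    # ...times Y margin
        print(f"log Bayes factor (dependence vs independence): {log_bf:.2f}")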

    Learning the Structure of Dynamic Probabilistic Networks

    Dynamic probabilistic networks (DPNs) are a compact representation of complex stochastic processes. In this paper we examine how to learn the structure of a DPN from data. We extend structure scoring rules for standard probabilistic networks to the dynamic case, and show how to search for structure when some of the variables are hidden. Finally, we examine two applications where such a technology might be useful: predicting and classifying dynamic behaviors, and learning causal orderings in biological processes. We provide empirical results that demonstrate the applicability of our methods in both domains.
    Comment: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998).
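
    A hedged illustration of what extending structure scores to the dynamic case amounts to: records become (slice t-1, slice t) pairs, and the score decomposes over families of the transition network. The toy below asks whether a binary process needs the temporal arc X[t-1] -> X[t]; all names are ours.

        import numpy as np

        def family_bic(child, parents_cols, n_states=2):
            """BIC contribution of one family in the transition network."""
            n = child.shape[0]
            idx = np.zeros(n, dtype=int)
            for col in parents_cols:
                idx = idx * n_states + col
            q = n_states ** len(parents_cols)
            counts = np.zeros((q, n_states))
            np.add.at(counts, (idx, child), 1)
            rows = counts.sum(axis=1, keepdims=True)
            with np.errstate(divide="ignore", invalid="ignore"):
                ll = np.where(counts > 0, counts * np.log(counts / rows), 0.0).sum()
            return ll - 0.5 * np.log(n) * q * (n_states - 1)

        rng = np.random.default_rng(5)
        x = [0]
        for _ in range(2000):                   # sticky two-state Markov chain
            x.append(x[-1] if rng.random() < 0.9 else 1 - x[-1])
        x = np.array(x)
        prev, curr = x[:-1], x[1:]              # (slice t-1, slice t) record pairs
        print("with temporal arc:   ", family_bic(curr, [prev]))
        print("without temporal arc:", family_bic(curr, []))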

    Dynamic Sparse Factor Analysis

    Its conceptual appeal and effectiveness have made latent factor modeling an indispensable tool for multivariate analysis. Despite its popularity across many fields, there are outstanding methodological challenges that have hampered practical deployments. One major challenge is the selection of the number of factors, which is exacerbated for dynamic factor models, where factors can disappear, emerge, and/or reoccur over time. Existing tools that assume a fixed number of factors may provide a misguided representation of the data mechanism, especially when the number of factors is crudely misspecified. Another challenge is the interpretability of the factor structure, which is often regarded as an unattainable objective due to the lack of identifiability. Motivated by a topical macroeconomic application, we develop a flexible Bayesian method for dynamic factor analysis (DFA) that can simultaneously accommodate a time-varying number of factors and enhance interpretability without strict identifiability constraints. To this end, we turn to dynamic sparsity by employing Dynamic Spike-and-Slab (DSS) priors within DFA. Scalable Bayesian EM estimation is proposed for fast posterior mode identification via rotations to sparsity, enabling Bayesian data analysis at scales that were previously impractical. We study a large-scale balanced panel of macroeconomic variables covering multiple facets of the US economy, with a focus on the Great Recession, to highlight the efficacy and usefulness of our proposed method.
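
    As a rough sketch of the spike-and-slab mechanism underlying DSS priors (the static building block, not the paper's dynamic version), each loading is modeled as a mixture of a narrow spike at zero and a diffuse slab, and its posterior inclusion probability is what drives sparsity in the EM updates. The variances and prior weight below are arbitrary choices of ours.

        import numpy as np
        from scipy.stats import norm

        def inclusion_prob(beta, v_spike=0.01, v_slab=4.0, prior=0.5):
            """Posterior probability that a loading comes from the slab
            (i.e., is 'active') under a two-component normal mixture."""
            slab = prior * norm.pdf(beta, scale=np.sqrt(v_slab))
            spike = (1 - prior) * norm.pdf(beta, scale=np.sqrt(v_spike))
            return slab / (slab + spike)

        for b in [0.02, 0.2, 1.0]:              # small loadings get shrunk to zero
            print(f"loading {b:+.2f}: P(slab) = {inclusion_prob(b):.3f}")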

    Learning Bayesian Networks from Incomplete Data with the Node-Average Likelihood

    Bayesian network (BN) structure learning from complete data has been extensively studied in the literature. However, fewer theoretical results are available for incomplete data, and most are related to the Expectation-Maximisation (EM) algorithm. Balov (2013) proposed an alternative approach called Node-Average Likelihood (NAL) that is competitive with EM but computationally more efficient, and proved its consistency and model identifiability for discrete BNs. In this paper, we give general sufficient conditions for the consistency of NAL, and we prove consistency and identifiability for conditional Gaussian BNs, which include discrete and Gaussian BNs as special cases. Furthermore, we confirm our results, and the results in Balov (2013), with an independent simulation study. Hence we show that NAL has a much wider applicability than originally implied in Balov (2013), and that it is competitive with EM for conditional Gaussian BNs as well.
    Comment: 27 pages, 5 figures.
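
    Our reading of the NAL idea, simplified: each node's conditional log-likelihood is averaged over only those records in which the node and its parents are all observed, so no inference over missing values is required, unlike EM. The sketch below illustrates that idea for discrete BNs and is not Balov's code; NaN marks missing entries.

        import numpy as np

        def nal(data, parents, cards):
            """Node-average log-likelihood of a discrete BN with missing data
            (our simplification of Balov 2013, for illustration only)."""
            total = 0.0
            for x, pa in parents.items():
                cols = [x] + list(pa)
                ok = ~np.isnan(data[:, cols]).any(axis=1)   # family fully observed
                sub = data[ok][:, cols].astype(int)
                q = int(np.prod([cards[p] for p in pa])) if pa else 1
                counts = np.zeros((q, cards[x]))
                idx = np.zeros(sub.shape[0], dtype=int)
                for k, p in enumerate(pa):
                    idx = idx * cards[p] + sub[:, k + 1]
                np.add.at(counts, (idx, sub[:, 0]), 1)
                probs = (counts + 1e-9) / (counts.sum(axis=1, keepdims=True) + 1e-9 * cards[x])
                total += np.log(probs[idx, sub[:, 0]]).mean()  # average, not sum
            return total

        rng = np.random.default_rng(6)
        a = rng.integers(0, 2, 400).astype(float)
        b = (a.astype(int) ^ (rng.random(400) < 0.2)).astype(float)
        b[rng.random(400) < 0.3] = np.nan                   # 30% of b missing
        data = np.column_stack([a, b])
        print(nal(data, {0: [], 1: [0]}, [2, 2]))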

    Multiple models of Bayesian networks applied to offline recognition of Arabic handwritten city names

    In this paper we address the problem of offline Arabic handwritten word recognition. Offline recognition of handwritten words is a difficult task due to the high variability and uncertainty of human writing. Most recent systems are constrained by the size of the lexicon they can handle and by the number of writers. In this paper, we propose an approach to multi-writer Arabic handwritten word recognition using multiple Bayesian networks. First, we cut the image into several blocks. For each block, we compute a vector of descriptors. Then, we use K-means to cluster the low-level features, including Zernike and Hu moments. Finally, we apply four variants of Bayesian network classifiers (Naïve Bayes, Tree Augmented Naïve Bayes (TAN), Forest Augmented Naïve Bayes (FAN), and Dynamic Bayesian Network (DBN)) to classify whole images of Tunisian city names. The results demonstrate that FAN and DBN achieve good recognition rates.
    Comment: arXiv admin note: substantial text overlap with arXiv:1204.167
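
    A hedged sketch of the described pipeline (block features, K-means quantization, then a Bayesian classifier): we stand in for the paper's TAN/FAN/DBN variants with plain Naive Bayes from scikit-learn, and the images, block grid, and class labels below are synthetic.

        import numpy as np
        import cv2
        from sklearn.cluster import KMeans
        from sklearn.naive_bayes import CategoricalNB

        def block_features(img, grid=(4, 4)):
            """Hu-moment descriptor for each block of the image."""
            h, w = img.shape
            bh, bw = h // grid[0], w // grid[1]
            feats = []
            for i in range(grid[0]):
                for j in range(grid[1]):
                    block = img[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
                    feats.append(cv2.HuMoments(cv2.moments(block)).flatten())
            return np.array(feats)                      # (n_blocks, 7)

        rng = np.random.default_rng(7)
        images = (rng.random((60, 64, 64)) < 0.2).astype(np.uint8) * 255
        labels = rng.integers(0, 3, 60)                 # 3 fake "city name" classes

        all_blocks = np.vstack([block_features(im) for im in images])
        kmeans = KMeans(n_clusters=16, n_init=10, random_state=0).fit(all_blocks)
        # each image becomes a sequence of 16 discrete block symbols
        X = np.array([kmeans.predict(block_features(im)) for im in images])
        clf = CategoricalNB().fit(X, labels)
        print("train accuracy:", clf.score(X, labels))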