Learning Bayesian Networks with Incomplete Data by Augmentation

Author: Adel Tameem
de Campos Cassio P.
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 08/10/2016

We present new algorithms for learning Bayesian networks from data with missing values using a data augmentation approach. An exact Bayesian network learning algorithm is obtained by recasting the problem into a standard Bayesian network learning problem without missing data. To the best of our knowledge, this is the first exact algorithm for this problem. As expected, the exact algorithm does not scale to large domains. We build on the exact method to create an approximate algorithm using a hill-climbing technique. This algorithm scales to large domains so long as a suitable standard structure learning method for complete data is available. We perform a wide range of experiments to demonstrate the benefits of learning Bayesian networks with this new approach.
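The idea of hill-climbing over completions of the missing values can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes the network structure is fixed and known, uses the maximum-likelihood log-likelihood as the score, and only searches over the values filled into missing cells. The function names `log_likelihood` and `complete_by_hill_climbing` are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(rows, parents):
    """Maximum-likelihood log-likelihood of complete discrete rows for a fixed DAG.

    `parents` maps each column index to a tuple of its parent column indices
    (an assumed, fixed structure; the paper also learns the structure).
    """
    total = 0.0
    for child, pa in parents.items():
        joint = Counter((tuple(r[p] for p in pa), r[child]) for r in rows)
        marginal = Counter(tuple(r[p] for p in pa) for r in rows)
        for (pa_value, _), count in joint.items():
            total += count * math.log(count / marginal[pa_value])
    return total


def complete_by_hill_climbing(rows, parents, states):
    """Fill None cells, then change one filled cell at a time while the score improves."""
    data = [list(r) for r in rows]
    missing = [(i, j) for i, r in enumerate(data)
               for j, v in enumerate(r) if v is None]
    for i, j in missing:
        data[i][j] = states[0]  # arbitrary starting completion
    best = log_likelihood(data, parents)
    improved = True
    while improved:
        improved = False
        for i, j in missing:
            for candidate in states:
                old = data[i][j]
                if candidate == old:
                    continue
                data[i][j] = candidate
                score = log_likelihood(data, parents)
                if score > best + 1e-12:
                    best = score  # keep the improving move
                    improved = True
                else:
                    data[i][j] = old  # revert a non-improving move
    return data, best


# Toy data for an assumed structure A -> B, with one missing value of B.
rows = [(0, 0)] * 4 + [(1, 1)] * 4 + [(1, None)]
completed, score = complete_by_hill_climbing(rows, {0: (), 1: (0,)}, states=[0, 1])
```

In this toy example the search moves the missing cell to the value that agrees with the observed pattern, because that completion gives the higher score. Like any hill-climbing search, it can stop at a local optimum on less trivial data.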

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

Enlighten

Association for the Advancement of Artificial Intelligence: AAAI Publications

Large-scale empirical validation of Bayesian Network structure learning algorithms with noisy data.

Author: Chobtham K
Constantinou AC
Guo Z
Kitson NK
Liu Y
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

Numerous Bayesian Network (BN) structure learning algorithms have been proposed in the literature over the past few decades. Each publication makes an empirical or theoretical case for the algorithm proposed in that publication and results across studies are often inconsistent in their claims about which algorithm is ‘best’. This is partly because there is no agreed evaluation approach to determine their effectiveness. Moreover, each algorithm is based on a set of assumptions, such as complete data and causal sufficiency, and tends to be evaluated with data that conforms to these assumptions, however unrealistic these assumptions may be in the real world. As a result, it is widely accepted that synthetic performance overestimates real performance, although to what degree this may happen remains unknown. This paper investigates the performance of 15 state-of-the-art, well-established, or recent promising structure learning algorithms. We propose a methodology that applies the algorithms to data that incorporates synthetic noise, in an effort to better understand the performance of structure learning algorithms when applied to real data. Each algorithm is tested over multiple case studies, sample sizes, types of noise, and assessed with multiple evaluation criteria. This work involved learning approximately 10,000 graphs with a total structure learning runtime of seven months. In investigating the impact of data noise, we provide the first large-scale empirical comparison of BN structure learning algorithms under different assumptions of data noise. The results suggest that traditional synthetic performance may overestimate real-world performance by anywhere between 10% and more than 50%. They also show that while score-based learning is generally superior to constraint-based learning, a higher fitting score does not necessarily imply a more accurate causal graph. The comparisons extend to other outcomes of interest, such as runtime, reliability, and resilience to noise, assessed over both small and large networks, and with both limited and big data. To facilitate comparisons with future studies, we have made all data, raw results, graphs and BN models freely available online.
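As a rough illustration of what adding synthetic noise to a clean discrete dataset can look like, here is a minimal Python sketch. The abstract does not specify the noise types used in the study, so the scheme below (replacing a fixed fraction of cells with a different, randomly chosen state) is an assumption, and `inject_noise` is a hypothetical helper rather than anything from the paper.

```python
import random


def inject_noise(rows, states, rate, seed=0):
    """Return a copy of `rows` in which a fraction `rate` of cells holds a wrong state.

    Assumes every variable shares the same list of `states` and that the list
    contains at least two values.
    """
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    noisy = [list(r) for r in rows]
    cells = [(i, j) for i in range(len(noisy)) for j in range(len(noisy[i]))]
    n_corrupt = round(rate * len(cells))
    for i, j in rng.sample(cells, n_corrupt):
        alternatives = [s for s in states if s != noisy[i][j]]
        noisy[i][j] = rng.choice(alternatives)
    return noisy


clean = [[0, 0, 0] for _ in range(10)]
noisy = inject_noise(clean, states=[0, 1, 2], rate=0.2, seed=1)
changed = sum(a != b for r, n in zip(clean, noisy) for a, b in zip(r, n))
```

A structure learning algorithm would then be run on both `clean` and `noisy`, and the two learned graphs compared against the known ground-truth graph.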

Queen Mary Research Online

Infinite Multiple Membership Relational Modeling for Complex Networks

Author: Hansen Lars Kai
Mørup Morten
Schmidt Mikkel N.
Publication date: 01/01/2010

Learning latent structure in complex networks has become an important problem fueled by many types of networked data originating from practically all fields of science. In this paper, we propose a new non-parametric Bayesian multiple-membership latent feature model for networks. Contrary to existing multiple-membership models that scale quadratically in the number of vertices, the proposed model scales linearly in the number of links, admitting multiple-membership analysis in large-scale networks. We demonstrate a connection between the single-membership relational model and multiple-membership models and show on "real" size benchmark network data that accounting for multiple memberships improves the learning of latent structure as measured by link prediction, while explicitly accounting for multiple membership results in a more compact representation of the latent structure of networks. Comment: 8 pages, 4 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

Online Research Database In Technology

Reconstructing gene regulatory network using heterogeneous biological data

Author: Ahmad Farzana Kabir
Yusoff Nooraini
Publication date: 01/01/2013

A gene regulatory network (GRN) is a model that describes the relationships among genes in a given condition. However, constructing a gene regulatory network is a complicated task, as high-throughput technologies generate large-scale data relative to the number of samples. In addition, the data involve a substantial amount of noise and false positive results that hinder the performance of downstream analysis. To address these problems, the Bayesian network model has attracted the most attention. However, the key challenge in using a Bayesian network to model a GRN is related to its structure learning. Bayesian network structure learning is NP-hard and computationally complex. Therefore, this research aims to address the issue related to Bayesian network structure learning by proposing a low-order conditional independence method. In addition, we revised the gene regulatory relationships by integrating heterogeneous biological datasets to extract transcription factors for regulator and target genes. The empirical results indicate that the proposed method works better with biological knowledge processing, with a precision of 83.3%, in comparison to a network that relies on microarray data only, which achieved a correctness of 80.85%.
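One common way to test low-order conditional independence is with zero-order (plain) and first-order partial correlations. The abstract does not say which test the authors use, so the Python sketch below is an assumption: it keeps an edge between two variables only if their correlation, and every partial correlation given one other variable, stays above a threshold. The names `pearson`, `partial_corr`, `low_order_edges` and the threshold value are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def low_order_edges(data, threshold=0.3):
    """Keep a pair only if it stays dependent at order 0 and at order 1."""
    names = list(data)
    edges = set()
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            x, y = data[names[a]], data[names[b]]
            if abs(pearson(x, y)) < threshold:
                continue  # marginally independent: no edge
            others = [k for k in names if k not in (names[a], names[b])]
            if all(abs(partial_corr(x, y, data[k])) >= threshold for k in others):
                edges.add((names[a], names[b]))
    return edges


# Toy data: x and y each depend on z, but not directly on each other.
data = {
    "x": [2, 1, 2, 5, 6, 5, 6, 9],
    "y": [2, 1, 4, 3, 4, 7, 6, 9],
    "z": [1, 2, 3, 4, 5, 6, 7, 8],
}
edges = low_order_edges(data)
```

In the toy data, x and y are strongly correlated, but the correlation vanishes once z is conditioned on, so only the x-z and y-z edges survive. Restricting the conditioning set to at most one variable keeps the number of tests manageable when there are many genes and few samples.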

UUM Repository

Learning Large-Scale Bayesian Networks with the sparsebn Package

Author: Aragam Bryon
Gu Jiaying
Zhou Qing
Publication venue: 'Foundation for Open Access Statistics'
Publication date: 10/03/2018

Learning graphical models from data is an important problem with wide applications, ranging from genomics to the social sciences. Nowadays datasets often have upwards of thousands (sometimes tens or hundreds of thousands) of variables and far fewer samples. To meet this challenge, we have developed a new R package called sparsebn for learning the structure of large, sparse graphical models with a focus on Bayesian networks. While there are many existing software packages for this task, this package focuses on the unique setting of learning large networks from high-dimensional data, possibly with interventions. As such, the methods provided place a premium on scalability and consistency in a high-dimensional setting. Furthermore, in the presence of interventions, the methods implemented here achieve the goal of learning a causal network from data. Additionally, the sparsebn package is fully compatible with existing software packages for network analysis. Comment: To appear in the Journal of Statistical Software, 39 pages, 7 figures

arXiv.org e-Print Archive

Journal of Statistical Software

Distributed Bayesian Computation and Self-Organized Learning in Sheets of Spiking Neurons with Local Lateral Inhibition

Author: Bill Johannes
Buesing Lars
Habenschuss Stefan
Legenstein Robert
Maass Wolfgang
Nessler Bernhard
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2015

During the last decade, Bayesian probability theory has emerged as a framework in cognitive science and neuroscience for describing perception, reasoning and learning of mammals. However, our understanding of how probabilistic computations could be organized in the brain, and how the observed connectivity structure of cortical microcircuits supports these calculations, is rudimentary at best. In this study, we investigate statistical inference and self-organized learning in a spatially extended spiking network model that accommodates both local competitive and large-scale associative aspects of neural information processing, under a unified Bayesian account. Specifically, we show how the spiking dynamics of a recurrent network with lateral excitation and local inhibition, in response to distributed spiking input, can be understood as sampling from a variational posterior distribution of a well-defined implicit probabilistic model. This interpretation further permits a rigorous analytical treatment of experience-dependent plasticity on the network level. Using machine learning theory, we derive update rules for neuron and synapse parameters which equate with Hebbian synaptic and homeostatic intrinsic plasticity rules in a neural implementation. In computer simulations, we demonstrate that the interplay of these plasticity rules leads to the emergence of probabilistic local experts that form distributed assemblies of similarly tuned cells communicating through lateral excitatory connections. The resulting sparse distributed spike code of a well-adapted network carries compressed information on salient input features combined with prior experience on correlations among them. Our theory predicts that the emergence of such efficient representations benefits from network architectures in which the range of local inhibition matches the spatial extent of pyramidal cells that share common afferent input.

Columbia University Academic Commons

Directory of Open Access Journals

PubMed Central

Hochschulschriftenserver - Universität Frankfurt am Main

FigShare