
    On the use of local search heuristics to improve GES-based Bayesian network learning

    Bayesian network learning is computationally expensive even when sacrificing the optimality of the result. Many methods aim at obtaining quality solutions in affordable times. Most of them are based on local search algorithms, as these allow evaluating candidate networks very efficiently, and they can be further improved by using local search-based metaheuristics to avoid getting stuck in local optima. This approach has been successfully applied in searching for network structures in the space of directed acyclic graphs. Other algorithms search for networks in the space of equivalence classes. The most important of these is GES (Greedy Equivalence Search). It guarantees obtaining the optimal network under certain conditions. However, it can also get stuck in local optima when learning from datasets of limited size. This article proposes the use of local search-based metaheuristics as a way to improve the behaviour of GES in such circumstances. These methods also guarantee asymptotic optimality, and the experiments show that they improve upon the score of the networks obtained with GES.
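The metaheuristic idea in this abstract, greedy local search wrapped in random restarts to escape local optima, can be sketched generically. This is an illustrative sketch only: GES itself applies insert and delete operators over equivalence classes, not the hypothetical `neighbours` callback used here.

```python
import random

def hill_climb_restarts(initial_states, neighbours, score, restarts=5, seed=0):
    """Greedy local search with random restarts: the simplest of the
    local-search-based metaheuristics mentioned in the abstract.
    Illustrative only; not the actual GES operators."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(restarts):
        state = rng.choice(initial_states)
        improved = True
        while improved:
            improved = False
            for nxt in neighbours(state):
                if score(nxt) > score(state):  # greedy: take first improving move
                    state, improved = nxt, True
                    break
        if score(state) > best_score:  # keep the best local optimum seen
            best, best_score = state, score(state)
    return best, best_score
```

For instance, climbing the toy objective `-(x - 3)**2` over the integers from any restart point converges to `x = 3` with score `0`.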

    A Continuation Method for Nash Equilibria in Structured Games

    Structured game representations have recently attracted interest as models for multi-agent artificial intelligence scenarios, with rational behavior most commonly characterized by Nash equilibria. This paper presents efficient, exact algorithms for computing Nash equilibria in structured game representations, including both graphical games and multi-agent influence diagrams (MAIDs). The algorithms are derived from a continuation method for normal-form and extensive-form games due to Govindan and Wilson; they follow a trajectory through a space of perturbed games and their equilibria, exploiting game structure through fast computation of the Jacobian of the payoff function. They are theoretically guaranteed to find at least one equilibrium of the game, and may find more. Our approach provides the first efficient algorithm for computing exact equilibria in graphical games with arbitrary topology, and the first algorithm to exploit fine-grained structural properties of MAIDs. Experimental results are presented demonstrating the effectiveness of the algorithms and comparing them to predecessors. The running time of the graphical game algorithm is similar to, and often better than, the running time of previous approximate algorithms. The algorithm for MAIDs can effectively solve games that are much larger than those solvable by previous methods.

    A survey of Bayesian Network structure learning


    A scoring function for learning Bayesian networks based on mutual information and conditional independence tests

    We propose a new scoring function for learning Bayesian networks from data using score+search algorithms. It is based on the concept of mutual information and exploits some well-known properties of this measure in a novel way. Essentially, a statistical independence test based on the chi-square distribution, associated with the mutual information measure, together with a property of additive decomposition of this measure, are combined in order to measure the degree of interaction between each variable and its parent variables in the network. The result is a non-Bayesian scoring function called MIT (mutual information tests) which belongs to the family of scores based on information theory. The MIT score also represents a penalization of the Kullback-Leibler divergence between the joint probability distributions associated with a candidate network and with the available data set. Detailed results of a complete experimental evaluation of the proposed scoring function and its comparison with the well-known K2, BDeu and BIC/MDL scores are also presented. I would like to acknowledge support for this work from the Spanish ‘Consejería de Innovación Ciencia y Empresa de la Junta de Andalucía’, under Project TIC-276.
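The key identity behind tests of this kind is that the empirical mutual information doubles as a test statistic: under independence, 2N·I(X;Y) (with I in nats) is asymptotically chi-square distributed with (|X|-1)(|Y|-1) degrees of freedom. A minimal sketch of the estimator (a hypothetical helper, not the authors' code):

```python
from collections import Counter
from math import log

def mutual_information(pairs):
    """Empirical mutual information I(X;Y) in nats from (x, y) samples.
    2 * N * I(X;Y) is the G-statistic, asymptotically chi-square with
    (|X|-1)*(|Y|-1) degrees of freedom when X and Y are independent,
    which is what turns MI into a statistical independence test."""
    n = len(pairs)
    pxy = Counter(pairs)                 # joint counts
    px = Counter(x for x, _ in pairs)    # marginal counts of X
    py = Counter(y for _, y in pairs)    # marginal counts of Y
    return sum(c / n * log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())
```

On 100 perfectly correlated binary pairs this returns log 2 (about 0.693 nats), and exactly 0 on a balanced independent sample.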

    Generalized belief change with imprecise probabilities and graphical models

    We provide a theoretical investigation of probabilistic belief revision in complex frameworks, under extended conditions of uncertainty, inconsistency and imprecision. We motivate our kinematical approach by specializing our discussion to probabilistic reasoning with graphical models, whose modular representation allows for efficient inference. Most results in this direction are derived from the relevant work of Chan and Darwiche (2005), which first proved the inter-reducibility of virtual and probabilistic evidence. Such forms of information, deeply distinct in their meaning, are extended to the conditional and imprecise frameworks, allowing further generalizations, e.g. to experts' qualitative assessments. Belief aggregation and iterated revision of a rational agent's belief are also explored.

    Mixed Order Hyper-Networks for Function Approximation and Optimisation

    Many systems take inputs, which can be measured and sometimes controlled, and outputs, which can also be measured and which depend on the inputs. Taking numerous measurements from such systems produces data, which may be used either to model the system with the goal of predicting the output associated with a given input (function approximation, or regression) or to find the input settings required to produce a desired output (optimisation, or search). Approximating or optimising a function is central to the field of computational intelligence. There are many existing methods for performing regression and optimisation based on samples of data, but they all have limitations. Multi-layer perceptrons (MLPs) are universal approximators, but they suffer from the black box problem, which means their structure and the function they implement are opaque to the user. They also suffer from a propensity to become trapped in local minima or large plateaux in the error function during learning. A regression method with a structure that allows models to be compared, human knowledge to be extracted, optimisation searches to be guided and model complexity to be controlled is desirable. This thesis presents such a method: a single framework for both regression and optimisation, the mixed order hyper network (MOHN). A MOHN implements a function f: {-1,1}^n -> R to arbitrary precision. The structure of a MOHN makes the ways in which input variables interact to determine the function output explicit, which allows human insight and complexity control that are very difficult in neural networks with hidden units. The explicit structure representation also allows efficient algorithms for searching for an input pattern that leads to a desired output. A number of learning rules for estimating the weights based on a sample of data are presented, along with a heuristic method for choosing which connections to include in a model.
    Several methods for searching a MOHN for inputs that lead to a desired output are compared. Experiments compare a MOHN to an MLP on regression tasks. The MOHN is found to achieve a comparable level of accuracy to an MLP, but suffers less from local minima in the error function and shows less variance across multiple training trials. It is also easier to interpret and easier to combine into an ensemble. The trade-off between the fit of a model to its training data and that to an independent set of test data is shown to be easier to control in a MOHN than in an MLP. A MOHN is also compared to a number of existing optimisation methods, including those using estimation of distribution algorithms, genetic algorithms and simulated annealing. The MOHN is able to find optimal solutions in far fewer function evaluations than these methods on tasks selected from the literature.
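The function class described here, a weighted sum of products over subsets of the ±1 inputs, can be read as an expansion in the Walsh basis. The sketch below assumes that form; `fit_full_order` is an illustrative exact fit over all 2^n points, not one of the thesis's sample-based learning rules.

```python
from itertools import combinations
from math import prod

class MOHN:
    """Minimal sketch of a mixed order hyper-network (assumed form):
    f(x) = sum over connections c of w_c * prod_{i in c} x_i, x_i in {-1, +1}.
    Each weight names an explicit interaction between input variables."""
    def __init__(self, weights):
        self.weights = weights  # dict: tuple of input indices -> weight

    def __call__(self, x):
        return sum(w * prod(x[i] for i in c) for c, w in self.weights.items())

def fit_full_order(f, n):
    """Read off every weight as a Walsh coefficient, w_c = E[f(x) * prod x_i],
    by averaging over all 2^n inputs (exact, but exponential; shown only to
    make the explicit-structure idea concrete)."""
    pts = [tuple(1 if (b >> i) & 1 else -1 for i in range(n)) for b in range(2 ** n)]
    coeffs = {}
    for k in range(n + 1):
        for c in combinations(range(n), k):
            w = sum(f(x) * prod(x[i] for i in c) for x in pts) / len(pts)
            if abs(w) > 1e-12:  # keep only the connections that matter
                coeffs[c] = w
    return MOHN(coeffs)
```

Fitting `f(x) = x0*x1 + 2*x2` recovers exactly two connections, weight 1 on (0, 1) and weight 2 on (2,), which is the interpretability point: the interactions are visible in the model's structure.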

    Learning Bayesian network equivalence classes using ant colony optimisation

    Bayesian networks have become an indispensable tool in the modelling of uncertain knowledge. Conceptually, they consist of two parts: a directed acyclic graph called the structure, and conditional probability distributions attached to each node known as the parameters. As a result of their expressiveness, understandability and rigorous mathematical basis, Bayesian networks have become one of the first methods investigated when faced with an uncertain problem domain. However, a recurring problem persists in specifying a Bayesian network. Both the structure and parameters can be difficult for experts to conceive, especially if their knowledge is tacit. To counteract these problems, research has been ongoing on learning both the structure and parameters of Bayesian networks from data. Whilst there are simple methods for learning the parameters, learning the structure has proved harder. Part of this stems from the NP-hardness of the problem and the super-exponential space of possible structures. To help solve this task, this thesis seeks to employ a relatively new technique that has had much success in tackling NP-hard problems. This technique is called ant colony optimisation. Ant colony optimisation is a metaheuristic based on the behaviour of ants acting together in a colony. It uses the stochastic activity of artificial ants to find good solutions to combinatorial optimisation problems. In the current work, this method is applied to the problem of searching through the space of equivalence classes of Bayesian networks, in order to find a good match against a set of data. The system uses operators that evaluate potential modifications to a current state. Each of the modifications is scored and the results used to inform the search. In order to facilitate these steps, other techniques are also devised to speed up the learning process.
    The techniques are tested by sampling data from gold-standard networks and learning structures from this sampled data. These structures are analysed using various goodness-of-fit measures to see how well the algorithms perform. The measures include structural similarity metrics and Bayesian scoring metrics. The results are compared in depth against systems that also use ant colony optimisation and other methods, including evolutionary programming and greedy heuristics. Comparisons are also made to well-known state-of-the-art algorithms, and a study is performed on a real-life data set. The results show favourable performance compared to the other methods and on modelling the real-life data.
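The general ant colony optimisation loop, sample candidate solutions guided by pheromone, then evaporate and deposit pheromone along the best one found, can be sketched on a toy bit-string problem. This is illustrative only; the thesis applies ACO to move operators over Bayesian network equivalence classes, not to bit strings.

```python
import random

def aco_binary(score, n, ants=10, iters=30, rho=0.3, seed=1):
    """Toy ant colony optimiser over {0,1}^n with a non-negative score.
    tau[i][v] is the pheromone for setting bit i to value v; ants sample
    bits in proportion to pheromone, and the best-so-far solution is
    reinforced each iteration after evaporation."""
    rng = random.Random(seed)
    tau = [[1.0, 1.0] for _ in range(n)]
    best, best_score = None, float("-inf")
    for _ in range(iters):
        for _ in range(ants):
            # each ant builds a solution bit by bit from the pheromone table
            sol = [0 if rng.random() < tau[i][0] / (tau[i][0] + tau[i][1]) else 1
                   for i in range(n)]
            s = score(sol)
            if s > best_score:
                best, best_score = sol, s
        # evaporate everywhere, then deposit along the best solution found
        for i in range(n):
            tau[i][0] *= (1 - rho)
            tau[i][1] *= (1 - rho)
            tau[i][best[i]] += best_score
    return best, best_score
```

On the one-max problem (`score=sum`, n=8) the pheromone quickly concentrates on high-scoring bit patterns.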

    A deterministic inference framework for discrete nonparametric latent variable models: learning complex probabilistic models with simple algorithms

    Latent variable models provide a powerful framework for describing complex data by capturing its structure with a combination of more compact unobserved variables. The Bayesian approach to statistical latent variable models additionally provides a consistent and principled framework for dealing with the uncertainty inherent in the data described with our model. However, in most Bayesian latent variable models we face the limitation that the number of unobserved variables has to be specified a priori. With increasingly large and complex data problems, such parametric models fail to make the most of the available data. Any increase in data passed into the model only affects the accuracy of the inferred posteriors, and the models fail to adapt to adequately capture newly arising structure. Flexible Bayesian nonparametric models can mitigate such challenges and allow learning arbitrarily complex representations, given that enough data is provided. However, their use is restricted to applications in which computational resources are plentiful, because of the exhaustive sampling methods they require for inference. At the same time, we see that in practice, despite the large variety of flexible models available, simple algorithms such as K-means or the Viterbi algorithm remain the preferred tools for most real-world applications. This has motivated us in this thesis to borrow the flexibility provided by Bayesian nonparametric models, but to derive easy-to-use, scalable techniques which can be applied to large data problems and can be run on resource-constrained embedded hardware. We propose nonparametric model-based clustering algorithms nearly as simple as K-means which overcome most of its challenges and can infer the number of clusters from the data. Their potential is demonstrated for many different scenarios and applications, such as phenotyping Parkinson's disease and Parkinsonism-related conditions in an unsupervised way.
    With a few simple steps we derive a related approach for nonparametric analysis of longitudinal data which converges a few orders of magnitude faster than currently available sampling methods. The framework is extended to efficient inference in nonparametric sequential models, with example applications in behaviour extraction and DNA sequencing. We demonstrate that our methods can easily be extended to allow for flexible online learning in a realistic setup using severely limited computational resources. We develop a system capable of inferring nonparametric hidden Markov models online from streaming data using only embedded hardware. This allowed us to develop occupancy-estimation technology using only a simple motion sensor.
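One well-known algorithm in the "nearly as simple as K-means, but infers k" family is DP-means (Kulis and Jordan's small-variance limit of a Dirichlet process mixture). The sketch below illustrates that idea; it is not necessarily the algorithm derived in this thesis, and `lam` is an assumed squared-distance penalty parameter.

```python
def dp_means(points, lam, iters=20):
    """DP-means sketch: Lloyd-style assignment and mean updates, except a
    point whose squared distance to every centre exceeds lam spawns a new
    cluster, so the number of clusters is inferred from the data."""
    centres = [points[0]]
    for _ in range(iters):
        assign = []
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
            j = min(range(len(centres)), key=lambda i: d[i])
            if d[j] > lam:                     # too far from everything: new cluster
                centres.append(p)
                j = len(centres) - 1
            assign.append(j)
        # recompute each centre as the mean of its assigned points
        for j in range(len(centres)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centres[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centres, assign
```

On two well-separated 2D blobs with a sensible `lam`, the algorithm discovers exactly two clusters without k being supplied.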