897 research outputs found

    Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions

    Ministerio de Economía y Competitividad and Fondo Europeo de Desarrollo Regional (FEDER), projects TEC2015-69496-R and TIN2016-77902-C3-2-

    Bayesian network learning algorithms using structural restrictions

    The use of several types of structural restrictions within algorithms for learning Bayesian networks is considered. These restrictions may codify expert knowledge in a given domain, in such a way that a Bayesian network representing this domain should satisfy them. The main goal of this paper is to study whether the algorithms for automatically learning the structure of a Bayesian network from data can obtain better results by using this prior knowledge. Three types of restrictions are formally defined: existence of arcs and/or edges, absence of arcs and/or edges, and ordering restrictions. We analyze the possible interactions between these types of restrictions and also how the restrictions can be managed within Bayesian network learning algorithms based on both the score + search and conditional independence paradigms. Then we particularize our study to two classical learning algorithms: a local search algorithm guided by a scoring function, with the operators of arc addition, arc removal and arc reversal, and the PC algorithm. We also carry out experiments using these two algorithms on several data sets. Spanish Junta de Comunidades de Castilla-La Mancha and Ministerio Educación y Ciencia, Projects PBC-02-002 and TIN2004-06204-C03-0
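    The three restriction types named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it handles directed arcs only (no undirected edges), and it assumes that an ordering restriction "x before y" means no directed path may lead from y to x. The function names and the arc representation as (parent, child) tuples are hypothetical.

```python
from collections import defaultdict


def has_path(arcs, source, target):
    """Return True if a directed path leads from source to target."""
    children = defaultdict(set)
    for parent, child in arcs:
        children[parent].add(child)
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children[node])
    return False


def is_acyclic(arcs):
    """A graph is acyclic if no arc parent -> child is closed by a path child -> parent."""
    return not any(has_path(arcs, child, parent) for parent, child in arcs)


def satisfies_restrictions(arcs, required=(), forbidden=(), ordering=()):
    """Check a candidate structure against the three restriction types.

    required  -- arcs that must be present (existence restrictions)
    forbidden -- arcs that must be absent (absence restrictions)
    ordering  -- (earlier, later) pairs; assumed to forbid any path later -> earlier
    """
    arcs = set(arcs)
    if not is_acyclic(arcs):
        return False
    if not set(required) <= arcs:
        return False
    if arcs & set(forbidden):
        return False
    return not any(has_path(arcs, later, earlier) for earlier, later in ordering)


candidate = {("A", "B"), ("B", "C")}
ok = satisfies_restrictions(
    candidate,
    required=[("A", "B")],
    forbidden=[("C", "A")],
    ordering=[("A", "C")],
)
```

    In a score + search setting, a check of this kind could be applied to each neighbour produced by arc addition, removal or reversal, so that candidates violating the prior knowledge are discarded before scoring.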

    Operations for Learning with Graphical Models

    This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models are extended to model data analysis and empirical learning using the notation of plates. Graphical operations for simplifying and manipulating a problem are provided including decomposition, differentiation, and the manipulation of probability models from the exponential family. Two standard algorithm schemas for learning are reviewed in a graphical framework: Gibbs sampling and the expectation maximization algorithm. Using these operations and schemas, some popular algorithms can be synthesized from their graphical specification. This includes versions of linear regression, techniques for feed-forward networks, and learning Gaussian and discrete Bayesian networks from data. The paper concludes by sketching some implications for data analysis and summarizing how some popular algorithms fall within the framework presented. The main original contributions here are the decomposition techniques and the demonstration that graphical models provide a framework for understanding and developing complex learning algorithms. Comment: See http://www.jair.org/ for any accompanying file

    Decision Making under Uncertainty through Extending Influence Diagrams with Interval-valued Parameters

    Influence Diagrams (IDs) are one of the most commonly used graphical and mathematical decision models for reasoning under uncertainty. In conventional IDs, both probabilities representing beliefs and utilities representing preferences of decision makers are precise point-valued parameters. However, it is usually difficult or even impossible to directly provide such parameters. In this paper, we extend conventional IDs to allow IDs with interval-valued parameters (IIDs), and develop a counterpart of Cooper's evaluation method to evaluate IIDs. IIDs avoid the difficulties attached to the specification of precise parameters and provide the capability to model decision making processes in situations where precise parameters cannot be specified. The counterpart of Cooper's evaluation method reduces the evaluation of IIDs to inference problems of IBNs. An algorithm based on the approximate inference of IBNs is proposed, and extensive experiments are conducted. The experimental results indicate that the proposed algorithm can find the optimal strategies effectively in IIDs, and that the interval-valued expected utilities obtained by the proposed algorithm are contained in those obtained by exact evaluation algorithms.

    A survey of Bayesian Network structure learning


    Learning Bayesian network equivalence classes using ant colony optimisation

    Bayesian networks have become an indispensable tool in the modelling of uncertain knowledge. Conceptually, they consist of two parts: a directed acyclic graph called the structure, and conditional probability distributions attached to each node known as the parameters. As a result of their expressiveness, understandability and rigorous mathematical basis, Bayesian networks have become one of the first methods investigated when faced with an uncertain problem domain. However, a recurring problem persists in specifying a Bayesian network. Both the structure and parameters can be difficult for experts to conceive, especially if their knowledge is tacit. To counteract these problems, research has been ongoing on learning both the structure and parameters of Bayesian networks from data. Whilst there are simple methods for learning the parameters, learning the structure has proved harder. Part of this stems from the NP-hardness of the problem and the super-exponential space of possible structures. To help solve this task, this thesis seeks to employ a relatively new technique that has had much success in tackling NP-hard problems. This technique is called ant colony optimisation. Ant colony optimisation is a metaheuristic based on the behaviour of ants acting together in a colony. It uses the stochastic activity of artificial ants to find good solutions to combinatorial optimisation problems. In the current work, this method is applied to the problem of searching through the space of equivalence classes of Bayesian networks, in order to find a good match against a set of data. The system uses operators that evaluate potential modifications to a current state. Each of the modifications is scored and the results used to inform the search. In order to facilitate these steps, other techniques are also devised to speed up the learning process.
    The techniques are tested by sampling data from gold standard networks and learning structures from this sampled data. These structures are analysed using various goodness-of-fit measures to see how well the algorithms perform. The measures include structural similarity metrics and Bayesian scoring metrics. The results are compared in depth against systems that also use ant colony optimisation and other methods, including evolutionary programming and greedy heuristics. Also, comparisons are made to well-known state-of-the-art algorithms and a study is performed on a real-life data set. The results show favourable performance compared to the other methods and on modelling the real-life data.

    Application of Bayesian networks to problems within obesity epidemiology

    Obesity is a significant public health problem in the United Kingdom and many other parts of the world, including some low-income settings. Although obesity prevalence has been rising for several decades, governments have been slow to implement policies that may have an impact at a population level. Numerous socio-demographic factors have been linked with obesity, but are highly intercorrelated, and identifying relevant factors or at-risk population groups is difficult. This thesis uses a graphical modelling approach, specifically Bayesian networks, to model the joint distribution of socio-demographic factors and obesity-related behaviour. The key advantages of graphical models in this context are their ability to model highly correlated data, and to represent complex relationships efficiently as network structure. Three separate pieces of work comprise this thesis. The first uses a sampling technique to identify the networks that best explain the observed data, and employs the common structural features of these networks to infer conditional dependencies present between socio-demographic variables and obesity-related behaviour indicators. We find that determinants of recreational physical activity differ between males and females, and that age and ethnicity have a significant influence on snacking behaviour. The second piece of work uses Bayesian networks to build a model of health behaviour given socio-demographic input, and then applies this to data from the 2001 census in order to provide an estimate of the health behaviour of a real population. The final analysis uses Bayesian network structure to explore potential determinants of body fat deposition patterns and compares the results to those derived from a Generalized Linear Model (GLM). Our approach successfully identifies the main determinants, age and Body Mass Index, although it is not a genuine alternative due to a lack of sensitivity to less important determinants.
    Beyond the application to obesity, results of this thesis are of a wider relevance to epidemiology as the field moves towards an increased use of Machine Learning techniques. The work conducted has also met and overcome several technical issues that are likely to be of relevance to others exploring similar approaches. EThOS - Electronic Theses Online Service, GB, United Kingdo

    Surprise: An Alternative Qualitative Uncertainty Model

    This dissertation embodies a study of the concept of surprise as a base for constructing qualitative calculi for representing and reasoning about uncertain knowledge. Two functions are presented, κ++ and z, which construct qualitative ranks for events by obtaining the order of magnitude abstraction of the degree of surprise associated with them. The functions use natural numbers to classify events based on their associated surprise and aim at providing a ranking that improves on those provided by existing ranking functions. This in turn enables the use of such functions in an a la carte probabilistic system where one can choose the level of detail required to represent uncertain knowledge depending on the requirements of the application. The proposed ranking functions are defined along with surprise-update models associated with them. The reasoning mechanisms associated with the functions are developed mathematically and graphically. The advantages and expected limitations of both functions are compared with respect to each other and with existing ranking functions in the context of a bioinformatics application known as "reverse engineering of genetic regulatory networks", in which the relations among various genetic components are discovered through the examination of a large amount of collected data. The ranking functions are examined in this context via graphical models which are exclusively developed for this purpose and which utilize the developed functions to represent uncertain knowledge at various levels of detail.