8 research outputs found

    Modeling hierarchical relationships in epidemiological studies: a Bayesian networks approach

    Hierarchical relationships between risk factors are seldom taken into account in epidemiological studies, though some authors have stressed the importance of doing so and proposed a conceptual framework in which each level of the hierarchy is modeled separately. The objective of this paper was to implement a simple version of their framework and to propose an alternative procedure based on a Bayesian network (BN). These approaches were illustrated by modeling the risk of diarrhea infection for 2740 children aged 0 to 59 months in Cameroon. The authors implemented a (naïve) logistic regression, a step-level logistic regression and a BN. While the first approach is inadequate, the other two both account for the hierarchical structure but lead to different estimates and interpretations. The BN implementation showed that a child in a family in the poorest group has probabilities of 89%, 40% and 18% of having poor sanitation, being malnourished and having diarrhea, respectively. An advantage of the BN approach is that it enables one to determine the probability that a risk factor (and/or the outcome) is in a given state, given the states of the others. Although the BN considered here is very simple, the method can deal with more complicated models.
    Keywords: Bayesian networks; hierarchical model; diarrhea infection; disease determinants; logistic regression
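
The kind of query this abstract describes (the probability of one node given the states of others) can be sketched with brute-force enumeration over a four-node network. This is only an illustration: the graph structure (poverty influencing sanitation and malnutrition, which in turn influence diarrhea) is an assumption, and only the values 0.89 and 0.40 come from the abstract. Every other number is a made-up placeholder, so the result is not expected to reproduce the reported 18%.

```python
from itertools import product

# 0.89 and 0.40 are the poorest-group figures quoted in the abstract.
# All other probabilities below are hypothetical placeholders.
P_POOREST = 0.25                                   # hypothetical prior
P_POOR_SANITATION = {True: 0.89, False: 0.50}      # keyed by "poorest?"
P_MALNOURISHED = {True: 0.40, False: 0.20}         # keyed by "poorest?"
P_DIARRHEA = {                                     # keyed by (sanitation, malnourished)
    (True, True): 0.30, (True, False): 0.15,
    (False, True): 0.20, (False, False): 0.10,
}

VARS = ("poorest", "sanitation", "malnourished", "diarrhea")


def bern(p, value):
    """Probability of a binary variable taking `value` when P(True) = p."""
    return p if value else 1.0 - p


def joint(poorest, sanitation, malnourished, diarrhea):
    """Joint probability as the product of the CPT entries along the graph."""
    return (bern(P_POOREST, poorest)
            * bern(P_POOR_SANITATION[poorest], sanitation)
            * bern(P_MALNOURISHED[poorest], malnourished)
            * bern(P_DIARRHEA[(sanitation, malnourished)], diarrhea))


def query(target, evidence):
    """P(target = True | evidence), summing the joint over all consistent worlds."""
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(VARS)):
        world = dict(zip(VARS, values))
        if any(world[name] != state for name, state in evidence.items()):
            continue
        p = joint(**world)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator


# Predictive query: outcome given the top-level risk factor.
print(query("diarrhea", {"poorest": True}))
# Diagnostic query: a risk factor given the outcome and another factor.
print(query("sanitation", {"poorest": True, "diarrhea": True}))
```

Enumeration is exponential in the number of variables, so it only suits very small networks like this one.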

    Feature selection for splice site prediction: A new method using EDA-based feature ranking

    BACKGROUND: The identification of relevant biological features in large and complex datasets is an important step towards gaining insight into the processes underlying the data. Other advantages of feature selection include the ability of the classification system to attain equally good or even better solutions using a restricted subset of features, and faster classification. Thus, robust methods for fast feature selection are of key importance in extracting knowledge from complex biological data. RESULTS: In this paper we present a novel method for feature subset selection applied to splice site prediction, based on estimation of distribution algorithms (EDAs), a framework that generalizes genetic algorithms. A feature ranking is derived from the distribution estimated by the algorithm, and this ranking is then used to iteratively discard features. We apply this technique to the problem of splice site prediction, and show how it can be used to gain insight into the underlying biological process of splicing. CONCLUSION: We show that this technique is more robust than the traditional use of estimation of distribution algorithms for feature selection: instead of returning a single best subset of features (as they normally do), this method provides a dynamic view of the feature selection process, like the traditional sequential wrapper methods. However, the method is faster than the traditional techniques, and scales better to datasets described by a large number of features.
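
The idea of deriving a feature ranking from an estimated distribution can be sketched with a univariate EDA (UMDA-style). This is a minimal illustration, not the authors' method: the function name `umda_feature_ranking`, the population sizes, the probability clamping and the toy fitness are all assumptions. In a real wrapper setting the fitness would be the accuracy of a classifier trained on the selected features.

```python
import random


def umda_feature_ranking(fitness, n_features, pop_size=60, n_select=20,
                         generations=30, seed=0):
    """Rank features by the per-feature inclusion probability learned by a univariate EDA."""
    rng = random.Random(seed)
    probs = [0.5] * n_features  # start with every feature equally likely
    for _ in range(generations):
        # Sample candidate feature subsets (bit masks) from the current distribution.
        population = [[rng.random() < p for p in probs] for _ in range(pop_size)]
        # Keep the fittest subsets.
        population.sort(key=fitness, reverse=True)
        elite = population[:n_select]
        # Re-estimate each marginal from the elite; clamp so no feature is fixed for good.
        probs = [min(0.95, max(0.05, sum(ind[i] for ind in elite) / n_select))
                 for i in range(n_features)]
    ranking = sorted(range(n_features), key=lambda i: probs[i], reverse=True)
    return ranking, probs


# Toy stand-in for classifier accuracy: features 0-2 are useful,
# and every selected feature carries a small cost.
RELEVANT = {0, 1, 2}


def toy_fitness(mask):
    return sum(mask[i] for i in RELEVANT) - 0.1 * sum(mask)


ranking, probs = umda_feature_ranking(toy_fitness, n_features=10)
print(ranking[:3])
```

Features can then be discarded from the bottom of `ranking` one step at a time, which gives the stepwise view of the selection process that the abstract contrasts with returning a single best subset.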

    Heuristic assignment of CPDs for probabilistic inference in junction trees

    Much research has addressed the efficient computation of probabilistic queries posed to Bayesian networks (BNs). One of the popular architectures for exact inference on BNs is the Junction Tree (JT) based architecture. Among all the different architectures developed, HUGIN is the most efficient JT-based architecture. The Global Propagation (GP) method used in the HUGIN architecture is arguably one of the best methods for probabilistic inference in BNs. Before the propagation, initialization is done to obtain the potential for each cluster in the JT. Then, with the GP method, each cluster potential becomes a cluster marginal by exchanging messages with its neighboring clusters. Many researchers have proposed improvements to make this message propagation more efficient. Still, the GP method can be very slow for dense networks. As BNs are applied to larger, more complex, and more realistic applications, developing more efficient inference algorithms has become increasingly important. Towards this goal, in this paper, we present heuristics for initialization that avoid unnecessary message passing among clusters of the JT and thereby improve the performance of the architecture by passing fewer messages.
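
The initialization step mentioned in this abstract assigns each conditional probability distribution (CPD) to one cluster that contains the variable and its parents. The sketch below shows only a baseline assignment rule and is not the heuristic proposed in the paper: `assign_cpds`, its "smallest containing cluster, then alphabetical" tie-break, and the three-node example are all assumptions made for illustration.

```python
def assign_cpds(families, clusters):
    """Assign each variable's CPD to one cluster that contains its whole family.

    families: {variable: set of parent variables}
    clusters: {cluster name: set of variables in that cluster}
    """
    assignment = {}
    for var, parents in families.items():
        family = {var} | set(parents)
        candidates = [name for name, members in clusters.items()
                      if family <= members]
        if not candidates:
            raise ValueError(f"no cluster contains the family of {var!r}")
        # Baseline rule (an assumption): smallest cluster, ties broken by name.
        assignment[var] = min(candidates,
                              key=lambda name: (len(clusters[name]), name))
    return assignment


# Hypothetical network B -> A, B -> C with junction tree AB -- BC.
families = {"A": {"B"}, "B": set(), "C": {"B"}}
clusters = {"AB": {"A", "B"}, "BC": {"B", "C"}}

assignment = assign_cpds(families, clusters)
print(assignment)
```

In this example P(B) fits in either cluster. That freedom of choice is the point at which an assignment heuristic can act, since where a CPD is placed affects which messages carry information during propagation.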

    Modellierung und Analyse individuellen Konsumentenverhaltens mit probabilistischen Holonen (Modeling and analysis of individual consumer behaviour with probabilistic holons)

    The focus of this work is the development of an agent-based, probabilistic model for representing and analysing individual consumer behaviour. The model provides a basis for decision making in marketing and especially in customer relationship management (CRM). As the foundation of the model, a class of probabilistic agents is introduced. These agents can be merged into holonic agents (holons) and have probabilistic knowledge bases adapted from Bayesian networks (behaviour networks). An individual customer is modelled as a customer agent, which is a probabilistic holon consisting of several feature agents. A feature agent represents a particular property (feature) of the customer's behaviour and encapsulates the corresponding feature-related behaviour networks. The overall behaviour of a customer agent is determined by the interaction of its feature agents. Individual behaviour patterns of a customer are extracted from historical customer data, taking given domain knowledge into account, and are represented within behaviour networks as non-linear dependencies between influencing factors and the customer's product-related reactions. Behaviour simulation is realised by evaluating the expected reactions of customers to given shopping scenarios, based on quantifiable, probabilistic reasoning. Customer agents can join together into customer group agents, which represent different aggregations of their members' behaviour.
    Based on behaviour networks, several behaviour-related methods of analysis as well as distance measures are developed to identify similar customers on the basis of their dynamic shopping behaviour. Subsequently, existing vector-based methods of classification and clustering are extended so that they can use behaviour-network-based representations in addition to classical attribute vectors. In addition, methods are developed to assign anonymous receipts to given customer groups in order to extend customer-related simulation results to all anonymous customers of a company. The benefits and quality of the developed models, methods and measures, which are implemented in an extensive software system, are demonstrated with practical examples and evaluated in several case studies based on real data from a German retailer.

    Reasoning about causality in games

    Causal reasoning and game-theoretic reasoning are fundamental topics in artificial intelligence, among many other disciplines: this paper is concerned with their intersection. Despite their importance, a formal framework that supports both these forms of reasoning has, until now, been lacking. We offer a solution in the form of (structural) causal games, which can be seen as extending Pearl's causal hierarchy to the game-theoretic domain, or as extending Koller and Milch's multi-agent influence diagrams to the causal domain. We then consider three key questions: i) How can the (causal) dependencies in games, either between variables or between strategies, be modelled in a uniform, principled manner? ii) How may causal queries be computed in causal games, and what assumptions does this require? iii) How do causal games compare to existing formalisms? To address question i), we introduce mechanised games, which encode dependencies between agents' decision rules and the distributions governing the game. In response to question ii), we present definitions of predictions, interventions, and counterfactuals, and discuss the assumptions required for each. Regarding question iii), we describe correspondences between causal games and other formalisms, and explain how causal games can be used to answer queries that other causal or game-theoretic models do not support. Finally, we highlight possible applications of causal games, aided by an extensive open-source Python library.