25 research outputs found

    Bayesian Tools for Early Disease Detection

    No full text
    Low pathogenic avian influenza (LPAI) is a poultry disease with mild and non-specific clinical symptoms. It has the potential to mutate into highly pathogenic avian influenza (HPAI), which is highly contagious and has serious clinical consequences, including high mortality. In 2003, an HPAI outbreak in the Netherlands cost more than a billion euros, and approximately 30 million animals had to be culled to control the outbreak. Such consequences make it important to detect LPAI as early as possible, before it can mutate into HPAI. We have developed and studied multiple Bayesian tools to assist the veterinarian in this task.
    Early detection of LPAI should start at the poultry farm, where daily measurements of the production parameters would ideally be registered. Important production parameters of a flock of laying hens are feed intake, water intake, mortality rate, egg production and egg weight. Comparing the measured values against the expected ones is the basis for detecting anomalies in these parameters. With living animals such as laying hens, the expected production values differ among flocks due to natural variability. We have developed the pseudo point method to remove this natural variability from the measurements for a production parameter. Based on an expected trend, expressed as a mathematical function, a prediction function is derived. Each measurement is used to update this prediction function, causing it to slowly adapt to the measured data.
    For a veterinarian visiting a poultry farm, it is very hard to distinguish LPAI from other poultry diseases. We have developed a Bayesian network to assist the veterinarian in determining whether there is an increased probability of the flock being infected with an LPAI virus. This probability is mainly based on farm-specific details, clinical signs at flock level, clinical signs at individual bird level, and post-mortem findings.
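    The pseudo point method itself is not spelled out in this abstract, so the following is only a minimal sketch of the general idea under stated assumptions: an expected trend function gives the baseline prediction, and a flock-specific correction factor (here a simple multiplicative offset updated by exponential smoothing — an assumption of this sketch, not the authors' formulation) slowly absorbs natural flock-level variability, so that large residuals flag possible anomalies. All names, parameters, and the toy trend function are hypothetical.

```python
# Illustrative sketch only: an expected trend f(t) gives the population-level
# prediction, and each measurement nudges a flock-specific multiplicative
# offset toward the observed data, so the prediction slowly adapts to this
# particular flock while large deviations are flagged as anomalies.

def expected_egg_production(day):
    """Toy expected trend: production rises, then plateaus (fraction of flock)."""
    return min(0.9, 0.5 + 0.01 * day)

class AdaptivePredictor:
    def __init__(self, learning_rate=0.1, alarm_threshold=0.15):
        self.offset = 1.0                  # flock-specific correction factor
        self.learning_rate = learning_rate
        self.alarm_threshold = alarm_threshold

    def predict(self, day):
        return self.offset * expected_egg_production(day)

    def update(self, day, measured):
        """Compare measurement to prediction, flag anomalies, then adapt."""
        predicted = self.predict(day)
        relative_error = (measured - predicted) / predicted
        anomaly = abs(relative_error) > self.alarm_threshold
        # Slowly absorb natural flock-level variability into the offset.
        self.offset *= 1.0 + self.learning_rate * relative_error
        return predicted, anomaly

predictor = AdaptivePredictor()
for day, measured in enumerate([0.52, 0.54, 0.55, 0.57, 0.30]):
    predicted, anomaly = predictor.update(day, measured)
    print(f"day {day}: predicted {predicted:.3f}, measured {measured:.3f}, anomaly={anomaly}")
```

    In this toy run, the first four measurements sit slightly above the expected trend, so the offset drifts upwards without raising an alarm; the sudden drop on the last day exceeds the alarm threshold and is flagged.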
When constructing Bayesian networks, the noisy-OR model and its generalisations are frequently used to reduce the number of parameter probabilities to be assessed. Empirical research has shown that using the noisy-OR model for nearly every causal mechanism in a network does not seriously hamper the quality of its output. We have provided a formal underpinning of this finding and have identified the conditions under which applying the noisy-OR model can potentially harm the network's output. Additionally, we have developed the intercausal cancellation model, of which the noisy-OR model is a special case, to provide network engineers with a tool for modelling cancellation effects between cause variables.
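The noisy-OR model itself is standard: each present cause independently fails to trigger the effect with probability one minus its own strength, and an optional leak term covers causes not modelled explicitly. A minimal implementation of this (leaky) noisy-OR computation:

```python
# Leaky noisy-OR: P(effect | causes) = 1 - (1 - leak) * prod over present
# causes i of (1 - p_i), where p_i is the probability that cause i alone
# triggers the effect.

def noisy_or(cause_probs, present, leak=0.0):
    """cause_probs[i] = P(effect | only cause i); present[i] = cause i active."""
    q = 1.0 - leak
    for p, x in zip(cause_probs, present):
        if x:
            q *= 1.0 - p
    return 1.0 - q

# Two causes that alone trigger the effect with probability 0.8 and 0.6:
print(noisy_or([0.8, 0.6], [True, True]))   # 1 - 0.2 * 0.4 = 0.92
print(noisy_or([0.8, 0.6], [False, True]))  # only the second cause contributes
```

The appeal of the model is visible in the parameter count: a binary effect with n binary causes needs only n (plus one leak) probabilities instead of 2^n table entries.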

    Combining Model-Based EAs for Mixed-Integer Problems

    No full text
    A key characteristic of Mixed-Integer (MI) problems is the presence of both continuous and discrete problem variables. These variables can interact in various ways, resulting in challenging optimization problems. In this paper, we study the design of an algorithm that combines the strengths of LTGA and iAMaLGaM: state-of-the-art model-building EAs designed for discrete and continuous search spaces, respectively. We examine and discuss the issues that emerge when trying to integrate these two algorithms in the MI setting. Our considerations lead to the design of a new algorithm for solving MI problems, which we motivate and compare with alternative approaches.
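    Neither LTGA's linkage learning nor iAMaLGaM's Gaussian model building is reproduced here; the sketch below only illustrates the mixed-integer representation such an algorithm operates on, with a toy fitness function (our assumption, not from the paper) in which the optimum of the continuous variables depends on the discrete ones — the kind of interaction that makes optimising the two parts separately fail.

```python
# Toy mixed-integer candidate: a binary part (the kind of variables LTGA
# varies) and a continuous part (the kind iAMaLGaM models with Gaussians).
# The fitness couples the two: the optimum of the continuous variables moves
# with the discrete ones, so neither part can be optimised in isolation.

def fitness(bits, reals):
    target = sum(bits)  # continuous optimum depends on the discrete part
    return -sum((r - target) ** 2 for r in reals) + sum(bits)

# The same continuous settings score differently under two discrete settings:
print(fitness([1, 1, 0], [2.0, 2.0]))  # reals sit exactly at the target 2
print(fitness([1, 1, 1], [2.0, 2.0]))  # target moves to 3, so fitness drops
```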

    Poset Representations for Sets of Elementary Triplets

    No full text
    Semi-graphoid independence relations, composed of independence triplets, are typically exponentially large in the number of variables involved. For a compact representation of such a relation, just a subset of its triplets, called a basis, is listed explicitly, while its other triplets remain implicit through a set of derivation rules. Two types of basis have been defined for this purpose: the dominant-triplet basis and the elementary-triplet basis, the latter of which is commonly assumed to be significantly larger in general. In this paper we introduce the elementary po-triplet as a compact representation of multiple elementary triplets, by using separating posets. By exploiting this new representation, the size of an elementary-triplet basis can be reduced considerably. For computing the elementary closure of a starting set of po-triplets, we present an elegant algorithm that operates on the least and largest elements of the separating posets involved.
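    For orientation, a small closure computation over plain elementary triplets — without the paper's po-triplet compaction — can be sketched using symmetry together with the standard exchange rule for elementary triplets, (a,b | Z∪{c}) and (a,c | Z) jointly yielding (a,c | Z∪{b}) and (a,b | Z). This is a naive fixed-point iteration, purely illustrative of what a closure is:

```python
# Elementary triplet (a, b, Z): "variable a is independent of variable b
# given the set Z", with a and b single variables. Naive closure under
# symmetry and the exchange rule:
#   (a, b | Z+{c}) and (a, c | Z)  =>  (a, c | Z+{b}) and (a, b | Z).

def close_elementary(triplets):
    closed = {(a, b, frozenset(Z)) for a, b, Z in triplets}
    changed = True
    while changed:
        changed = False
        new = set()
        for a, b, Z in closed:
            new.add((b, a, Z))                      # symmetry
            for a2, c, Z2 in closed:
                # exchange: outer triplet has Z = Z2 + {c}, inner is (a, c | Z2)
                if a == a2 and c in Z and Z - {c} == Z2:
                    new.add((a, c, Z2 | {b}))
                    new.add((a, b, Z2))
        if not new <= closed:
            closed |= new
            changed = True
    return closed

start = {("a", "b", frozenset({"c"})), ("a", "c", frozenset())}
for t in sorted((a, b, tuple(sorted(Z))) for a, b, Z in close_elementary(start)):
    print(t)
```

    Even this two-triplet basis closes to eight elementary triplets, which is the growth the po-triplet representation is designed to keep in check.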

    Supporting Discussions About Forensic Bayesian Networks Using Argumentation

    No full text
    Bayesian networks (BNs) are powerful tools that are increasingly being used by forensic and legal experts to reason about the uncertain conclusions that can be inferred from the evidence in a case. Although it is good practice in BN construction to document the model itself, the importance of documenting design decisions has received little attention. Such decisions, including the (possibly conflicting) reasons behind them, are important for legal experts to understand and accept probabilistic models of cases. Moreover, when disagreements arise between the domain experts involved in constructing a BN, there are no systematic means to resolve them. We therefore propose an approach that allows domain experts to explicitly express and capture their reasons for and against modelling decisions using argumentation, and that resolves their disagreements as far as possible. Our approach is based on a case study in which the argumentation structure of an actual disagreement between two forensic BN experts is analysed.

    A new technique of selecting an optimal blocking method for better record linkage

    No full text
    Record linkage, also referred to as entity resolution, is the process of identifying pairs of records that represent the same real-world entity (e.g. a person) within a dataset or across multiple datasets. To reduce the number of record comparisons, record linkage frameworks first perform a process referred to as blocking, which splits the records into a set of blocks using a partition (or blocking) scheme; during the linkage process, comparisons are then restricted to records that belong to the same block. Existing blocking methods are often evaluated using different metrics and independently of the choice of the subsequent linkage method, which makes the choice of an optimal approach very subjective. In this paper we demonstrate that existing evaluation metrics fail to provide strong evidence for selecting an optimal blocking method. We conduct an extensive evaluation of different blocking methods using multiple datasets and several commonly applied linkage techniques to show that the evaluation of a blocking method must take the subsequent linkage phase into consideration. We propose a novel evaluation technique that takes multiple factors into account, including the end-to-end running time of the combined blocking and linkage phases as well as the linkage technique used. We empirically demonstrate, using multiple datasets, that according to this novel evaluation technique some blocking methods can fairly be considered superior to others, while some should be deemed incomparable under those factors. Finally, we propose a novel blocking method selection procedure that takes into consideration the linkage proficiency and end-to-end time of different blocking methods combined with a given linkage technique. We show that this procedure is able to select the best or a near-best blocking method for unseen data.
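    A minimal example of standard blocking illustrates the trade-off the paper evaluates; the blocking key used here (first letter of the surname plus birth year) is ours, purely for illustration, not one of the methods studied:

```python
from collections import defaultdict
from itertools import combinations

# Standard blocking: group records by a blocking key, then compare only
# pairs that fall inside the same block during the linkage phase.

def block(records, key):
    blocks = defaultdict(list)
    for r in records:
        blocks[key(r)].append(r)
    return blocks

def candidate_pairs(blocks):
    for group in blocks.values():
        yield from combinations(group, 2)

records = [
    {"id": 1, "surname": "Smith", "year": 1980},
    {"id": 2, "surname": "Smyth", "year": 1980},
    {"id": 3, "surname": "Jones", "year": 1975},
    {"id": 4, "surname": "Smith", "year": 1975},
]
blocks = block(records, key=lambda r: (r["surname"][0], r["year"]))
pairs = list(candidate_pairs(blocks))
print(len(pairs), "of 6 possible pairs survive blocking")
```

    Only one of the six possible pairs is compared, which is the intended saving — but note that records 1 and 4 (same surname, different year) land in different blocks and can never be matched, regardless of how good the subsequent linkage technique is. This recall loss is exactly why, as the paper argues, a blocking method cannot be judged independently of the linkage phase that follows it.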

    The Multiple Insertion Pyramid: A Fast Parameter-Less Population Scheme.

    No full text
    The Parameter-less Population Pyramid (P3) uses a novel population scheme, called the population pyramid. This scheme does not require a fixed population size; instead, it keeps adding new solutions to an ever-growing set of layered populations. P3 is very efficient in terms of the number of fitness function evaluations, but its run-time is significantly higher than that of the Gene-pool Optimal Mixing Evolutionary Algorithm (GOMEA), which uses the same method of exploration. This higher run-time is caused by the need to rebuild the linkage tree every time a single new solution is added to the population pyramid. We propose a new population scheme, called the multiple insertion pyramid, that results in a faster variant of P3 by inserting multiple solutions at the same time and operating on populations instead of single solutions.
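    A toy sketch of why batched insertion helps — the pyramid structure here is heavily simplified and the linkage-model rebuild is a stub counter, not actual linkage learning — shows that per-layer models need rebuilding once per batch rather than once per inserted solution:

```python
# Simplified layered population pyramid. The only point illustrated is the
# model-rebuild count: one-at-a-time insertion rebuilds the (stubbed) linkage
# model per solution, batched insertion rebuilds it once per batch.

class Pyramid:
    def __init__(self):
        self.layers = []        # layers[i] is the population at pyramid level i
        self.model_rebuilds = 0

    def _rebuild_model(self, level):
        # Placeholder for learning a linkage model over self.layers[level].
        self.model_rebuilds += 1

    def insert_one(self, solution):
        if not self.layers:
            self.layers.append([])
        self.layers[0].append(solution)
        self._rebuild_model(0)          # rebuild after every single insertion

    def insert_batch(self, solutions):
        if not self.layers:
            self.layers.append([])
        self.layers[0].extend(solutions)
        self._rebuild_model(0)          # one rebuild for the whole batch

one_by_one, batched = Pyramid(), Pyramid()
for s in range(8):
    one_by_one.insert_one(s)
batched.insert_batch(list(range(8)))
print(one_by_one.model_rebuilds, "vs", batched.model_rebuilds)  # 8 vs 1
```

    Both pyramids end up with the same level-0 population; only the number of model rebuilds differs, which is the source of the run-time saving the multiple insertion pyramid targets.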

    A lattice-based representation of independence relations for efficient closure computation

    No full text
    Independence relations in general include exponentially many statements of independence, that is, exponential in the number of variables involved. Such relations are typically characterised, however, by a small set of such statements and an associated set of derivation rules. While various computational problems on independence relations can be solved by manipulating these smaller sets, without the need to explicitly generate the full relation, existing algorithms for constructing these sets have often prohibitively high running times. In this paper, we introduce a lattice structure for organising sets of independence statements and show that current algorithms are rendered computationally less demanding by exploiting new insights into the structural properties of independence gained from this lattice organisation. By means of a range of experimental results, we subsequently demonstrate that the lattice organisation indeed yields a substantial gain in efficiency for fast-closure computation of semi-graphoid independence relations in practice.
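    The derivation rules referred to are the semi-graphoid axioms over triplets I(A, B | C) of disjoint variable sets: symmetry, decomposition, weak union, and contraction. They transcribe directly to code (the lattice organisation of the paper is not reproduced here; this only shows the rules whose repeated application a closure algorithm must organise):

```python
# Semi-graphoid derivation rules over triplets (A, B, C) of disjoint
# frozensets, read as "A is independent of B given C".

def symmetry(t):
    A, B, C = t                      # I(A, B | C)  =>  I(B, A | C)
    return (B, A, C)

def decomposition(t, D):
    A, B, C = t                      # I(A, B | C), D a proper subset of B
    return (A, B - D, C)             # =>  I(A, B - D | C)

def weak_union(t, D):
    A, B, C = t                      # I(A, B | C), D a proper subset of B
    return (A, B - D, C | D)         # =>  I(A, B - D | C + D)

def contraction(t1, t2):
    A, B, C = t1                     # I(A, B | C)
    A2, D, C2 = t2                   # I(A, D | C + B)
    assert A == A2 and C2 == C | B
    return (A, B | D, C)             # =>  I(A, B + D | C)

f = frozenset
t = (f({"x"}), f({"y", "z"}), f({"w"}))           # I(x, yz | w)
print(decomposition(t, f({"z"})))                  # I(x, y | w)
print(weak_union(t, f({"z"})))                     # I(x, y | wz)
print(contraction((f({"x"}), f({"y"}), f({"w"})),
                  (f({"x"}), f({"z"}), f({"w", "y"}))))  # back to I(x, yz | w)
```

    The closure of a starting set under these four rules is what grows exponentially, which is why organising the intermediate statements — here, in a lattice — pays off.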