25 research outputs found
A Research Agenda for Hybrid Intelligence: Augmenting Human Intellect With Collaborative, Adaptive, Responsible, and Explainable Artificial Intelligence
We define hybrid intelligence (HI) as the combination of human and machine intelligence, augmenting human intellect and capabilities instead of replacing them and achieving goals that were unreachable by either humans or machines. HI is an important new research focus for artificial intelligence, and we set a research agenda for HI by formulating four challenges.
Bayesian Tools for Early Disease Detection
Low pathogenic avian influenza (LPAI) is a poultry disease with mild and non-specific clinical symptoms. It can mutate into highly pathogenic avian influenza (HPAI), which is highly contagious and comes with serious clinical symptoms such as high mortality. In 2003, an HPAI outbreak in the Netherlands cost more than a billion euros, and approximately 30 million animals had to be culled to control the outbreak. Such consequences make it important to detect LPAI as early as possible, before it can mutate into HPAI. We have developed and studied multiple Bayesian tools to assist the veterinarian in this task. The early detection of LPAI should start at the poultry farm, where daily measurements of the production parameters would ideally be registered. Important production parameters of a flock of laying hens are the feed intake, water intake, mortality rate, egg production and egg weight. Comparing the measured values against the expected ones is the basis for detecting anomalies in the production parameters. With living animals, such as laying hens, the expected production values differ among flocks due to natural variability. We have developed the pseudo point method to remove this natural variability from the measurements for a production parameter. Based on an expected trend, expressed by a mathematical function, a prediction function is derived. Each measurement is used to update this prediction function, causing it to slowly adapt to the measured data. For a veterinarian visiting a poultry farm, it is very hard to distinguish LPAI from other poultry diseases. We have developed a Bayesian network to assist the veterinarian in determining whether there is an increased probability of the flock being infected with an LPAI virus. This probability is mainly based on farm-specific details, clinical signs at flock level, clinical signs at individual bird level and post-mortem findings.
When constructing Bayesian networks, the noisy-OR model and its generalisations are frequently used to reduce the number of parameter probabilities to be assessed. Empirical research has shown that using the noisy-OR model for nearly every causal mechanism in a network does not seriously hamper the quality of its output. We have provided a formal underpinning of this finding and have identified the conditions under which applying the noisy-OR model can potentially harm the network's output. Additionally, we have developed the intercausal cancellation model, of which the noisy-OR model is a special case, to provide network engineers with a tool to model cancellation effects between cause variables.
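To illustrate why the noisy-OR model reduces the number of probabilities to be assessed, the following is a minimal sketch of the standard textbook noisy-OR, not of the intercausal cancellation model or any code from the thesis. The function names `noisy_or` and `noisy_or_cpt`, the optional `leak` parameter and the example probabilities are assumptions made for illustration. With n binary causes, only n link probabilities (plus an optional leak) are elicited, and the full table of 2^n entries is derived from them.

```python
from itertools import product
from math import prod


def noisy_or(link_probs, present, leak=0.0):
    """Return P(effect = true | causes) under the standard noisy-OR model.

    link_probs[i]: probability that cause i on its own produces the effect.
    present[i]:    whether cause i is active in this configuration.
    leak:          hypothetical background probability of the effect (default 0).
    """
    if len(link_probs) != len(present):
        raise ValueError("link_probs and present must have the same length")
    # The effect is absent only if the leak and every active cause all fail.
    all_fail = prod(1.0 - p for p, on in zip(link_probs, present) if on)
    return 1.0 - (1.0 - leak) * all_fail


def noisy_or_cpt(link_probs, leak=0.0):
    """Derive the full conditional probability table from n link probabilities."""
    n = len(link_probs)
    return {
        combo: noisy_or(link_probs, combo, leak)
        for combo in product((False, True), repeat=n)
    }


# Illustrative values only: two causes with link probabilities 0.8 and 0.5.
cpt = noisy_or_cpt([0.8, 0.5])
```

Here two assessed numbers determine all four table entries; for ten causes, ten numbers would determine 1024 entries.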
Combining Model-Based EAs for Mixed-Integer Problems
A key characteristic of Mixed-Integer (MI) problems is the presence of both continuous and discrete problem variables. These variables can interact in various ways, resulting in challenging optimization problems. In this paper, we study the design of an algorithm that combines the strengths of LTGA and iAMaLGaM: state-of-the-art model-building EAs designed for discrete and continuous search spaces, respectively. We examine and discuss issues which emerge when trying to integrate those two algorithms into the MI setting. Our considerations lead to a design of a new algorithm for solving MI problems, which we motivate and compare with alternative approaches.
Poset Representations for Sets of Elementary Triplets
Semi-graphoid independence relations, composed of independence triplets, are typically exponentially large in the number of variables involved. For compact representation of such a relation, just a subset of its triplets, called a basis, is listed explicitly, while its other triplets remain implicit through a set of derivation rules. Two types of basis have been defined for this purpose: the dominant-triplet basis and the elementary-triplet basis, of which the latter is commonly assumed to be significantly larger in general. In this paper we introduce the elementary po-triplet as a compact representation of multiple elementary triplets, by using separating posets. By exploiting this new representation, the size of an elementary-triplet basis can be reduced considerably. For computing the elementary closure of a starting set of po-triplets, we present an elegant algorithm that operates on the least and largest elements of the separating posets involved.
Supporting Discussions About Forensic Bayesian Networks Using Argumentation
Bayesian networks (BNs) are powerful tools that are increasingly being used by forensic and legal experts to reason about the uncertain conclusions that can be inferred from the evidence in a case. Although in BN construction it is good practice to document the model itself, the importance of documenting design decisions has received little attention. Such decisions, including the (possibly conflicting) reasons behind them, are important for legal experts to understand and accept probabilistic models of cases. Moreover, when disagreements arise between domain experts involved in the construction of BNs, there are no systematic means to resolve such disagreements. Therefore, we propose an approach that allows domain experts to explicitly express and capture their reasons pro and con modelling decisions using argumentation, and that resolves their disagreements as much as possible. Our approach is based on a case study, in which the argumentation structure of an actual disagreement between two forensic BN experts is analysed.
A new technique of selecting an optimal blocking method for better record linkage
Record linkage, also referred to as entity resolution, is the process of identifying pairs of records representing the same real-world entity (e.g. a person) within a dataset or across multiple datasets. In order to reduce the number of record comparisons, record linkage frameworks initially perform a process referred to as blocking, which involves splitting records into a set of blocks using a partition (or blocking) scheme. This restricts comparisons to records that belong to the same block during the linkage process. Existing blocking methods are often evaluated using different metrics and independently of the choice of the subsequent linkage method, which makes the choice of an optimal approach very subjective. In this paper we demonstrate that existing evaluation metrics fail to provide strong evidence to support the selection of an optimal blocking method. We conduct an extensive evaluation of different blocking methods using multiple datasets and some commonly applied linkage techniques to show that evaluation of a blocking method must take into consideration the subsequent linkage phase. We propose a novel evaluation technique that takes into consideration multiple factors, including the end-to-end running time of the combined blocking and linkage phases as well as the linkage technique used. We empirically demonstrate using multiple datasets that, according to this novel evaluation technique, some blocking methods can be fairly considered superior to others, while some should be deemed incomparable according to those factors. Finally, we propose a novel blocking method selection procedure that takes into consideration the linkage proficiency and end-to-end time of different blocking methods combined with a given linkage technique. We show that this technique is able to select the best or near-best blocking method for unseen data.
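The blocking step described above can be sketched as follows. This is a generic illustration of key-based blocking, not the evaluation technique or selection procedure proposed in the paper; the helper names `block_records` and `candidate_pairs`, the choice of a postcode field as blocking key, and the sample records are all invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def block_records(records, blocking_key):
    """Partition records into blocks keyed by blocking_key(record)."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)
    return blocks


def candidate_pairs(blocks):
    """Yield record pairs to compare: only pairs that share a block."""
    for members in blocks.values():
        yield from combinations(members, 2)


# Made-up sample data; the "zip" field serves as a hypothetical blocking key.
records = [
    {"id": 1, "name": "Anna Smith", "zip": "3584"},
    {"id": 2, "name": "Ana Smith", "zip": "3584"},
    {"id": 3, "name": "Bob Jones", "zip": "1012"},
    {"id": 4, "name": "Robert Jones", "zip": "1012"},
    {"id": 5, "name": "Carla Diaz", "zip": "9711"},
]

blocks = block_records(records, lambda r: r["zip"])
pairs = [(a["id"], b["id"]) for a, b in candidate_pairs(blocks)]
all_pairs = len(list(combinations(records, 2)))
```

Without blocking, the five records would require ten comparisons; with this key, only the two within-block pairs are passed to the linkage phase. Any true match whose records fall into different blocks is lost, which is why the quality of a blocking scheme cannot be judged separately from the linkage step that follows it.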
The Multiple Insertion Pyramid: A Fast Parameter-Less Population Scheme
The Parameter-less Population Pyramid (P3) uses a novel population scheme, called the population pyramid. This population scheme does not require a fixed population size; instead, it keeps adding new solutions to an ever-growing set of layered populations. P3 is very efficient in terms of the number of fitness function evaluations, but its run-time is significantly higher than that of the Gene-pool Optimal Mixing Evolutionary Algorithm (GOMEA), which uses the same method of exploration. This higher run-time is caused by the need to rebuild the linkage tree every time a single new solution is added to the population pyramid. We propose a new population scheme, called the multiple insertion pyramid, that results in a faster variant of P3 by inserting multiple solutions at the same time and operating on populations instead of on single solutions.
Balanced sensitivity functions for tuning multi-dimensional Bayesian network classifiers
Multi-dimensional Bayesian network classifiers are Bayesian networks of restricted topological structure, which are tailored to classifying data instances into multiple dimensions. Like more traditional classifiers, multi-dimensional classifiers are typically learned from data and may include inaccuracies in their parameter probabilities. We show that the graphical properties and dedicated use of these classifiers induce higher-order sensitivity functions of a highly constrained functional form in these parameters. We then introduce the concept of a balanced sensitivity function, in which multiple parameters are functionally related by the odds ratios of their original and new values, and argue that these functions provide a suitable heuristic for tuning multi-dimensional classifiers with guaranteed bounds on the effects on their output probabilities. We demonstrate the practicability of our heuristic by means of a classifier for a real-world application in the veterinary field.