79 research outputs found
Learning Bayesian Networks That Enable Full Propagation of Evidence
This paper builds on recent developments in Bayesian network (BN) structure
learning under the controversial assumption that the input variables are
dependent. This assumption can be viewed as a learning constraint geared
towards cases where the input variables are known or assumed to be dependent.
It addresses the problem of learning multiple disjoint subgraphs that do not
enable full propagation of evidence. This problem is highly prevalent in cases
where the sample size of the input data is low with respect to the
dimensionality of the model, which is often the case when working with real
data. The paper presents a novel hybrid structure learning algorithm, called
SaiyanH, that addresses this issue. The results show that this constraint helps
the algorithm to estimate the number of true edges with higher accuracy
compared to the state-of-the-art. Out of the 13 algorithms investigated, the
results rank SaiyanH 4th in reconstructing the true DAG, with accuracy scores
lower by 8.1% (F1), 10.2% (BSF), and 19.5% (SHD) compared to the top ranked
algorithm, and higher by 75.5% (F1), 118% (BSF), and 4.3% (SHD) compared to the
bottom ranked algorithm. Overall, the results suggest that the proposed
algorithm discovers satisfactorily accurate connected DAGs in cases where other
algorithms produce multiple disjoint subgraphs that often underfit the true
graph.
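
As a rough illustration of the two ideas in this abstract, the sketch below counts the disjoint subgraphs in a learned structure and computes a plain F1 score over directed edges. It is not SaiyanH and not the paper's exact metric definitions (the paper's F1, BSF, and SHD may treat edge orientation differently); the function names and the toy graphs are assumptions made for illustration only.

```python
from collections import defaultdict


def count_components(nodes, edges):
    """Count weakly connected components, ignoring edge direction.

    A result above 1 means the graph splits into disjoint subgraphs,
    so evidence entered in one part cannot propagate to the others.
    """
    adjacency = defaultdict(set)
    for parent, child in edges:
        adjacency[parent].add(child)
        adjacency[child].add(parent)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return components


def edge_f1(true_edges, learned_edges):
    """F1 over exactly matching directed edges (a simplified definition)."""
    true_set, learned_set = set(true_edges), set(learned_edges)
    true_positives = len(true_set & learned_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(learned_set)
    recall = true_positives / len(true_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy graphs: the learned graph misses the edge B -> C.
nodes = ["A", "B", "C", "D"]
true_dag = [("A", "B"), ("B", "C"), ("C", "D")]
learned = [("A", "B"), ("C", "D")]

print(count_components(nodes, true_dag))  # connected graph
print(count_components(nodes, learned))   # two disjoint subgraphs
print(edge_f1(true_dag, learned))
```

In this toy case the learned graph has perfect precision but underfits the true graph, which is the failure mode the abstract describes for algorithms that return disjoint subgraphs.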
Bayesian network structure learning with causal effects in the presence of latent variables.
Latent variables may lead to spurious relationships that can be
misinterpreted as causal relationships. In Bayesian Networks (BNs), this
challenge is known as learning under causal insufficiency. Structure learning
algorithms that assume causal insufficiency tend to reconstruct the ancestral
graph of a BN, where bi-directed edges represent confounding and directed edges
represent direct or ancestral relationships. This paper describes a hybrid
structure learning algorithm, called CCHM, which combines the constraint-based
part of cFCI with hill-climbing score-based learning. The score-based process
incorporates Pearl's do-calculus to measure causal effects and orientate edges
that would otherwise remain undirected, under the assumption that the BN is a
linear Structural Equation Model where data follow a multivariate Gaussian
distribution. Experiments based on both randomised and well-known networks show
that CCHM improves the state-of-the-art in terms of reconstructing the true
ancestral graph.
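
The following is a minimal sketch of why confounding matters when measuring a causal effect in a linear Gaussian model; it is not the CCHM algorithm. It assumes a hypothetical three-variable model in which Z confounds X and Y, and, unlike the latent-variable setting of the paper, it assumes Z is observed so that the effect of do(X) can be recovered by adjusting for Z. All coefficients and function names are illustrative assumptions.

```python
import random


def simulate(n, effect=2.0, seed=0):
    """Draw samples from Z -> X, Z -> Y, X -> Y with Gaussian noise."""
    rng = random.Random(seed)
    zs, xs, ys = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = 1.5 * z + rng.gauss(0, 1)
        y = effect * x + 3.0 * z + rng.gauss(0, 1)
        zs.append(z)
        xs.append(x)
        ys.append(y)
    return zs, xs, ys


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / len(a)


def naive_slope(x, y):
    """Regression slope of Y on X alone; biased when Z is ignored."""
    return cov(x, y) / cov(x, x)


def adjusted_slope(x, y, z):
    """Coefficient of X when regressing Y on X and Z (back-door adjustment)."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    return (sxy * szz - szy * sxz) / (sxx * szz - sxz ** 2)


zs, xs, ys = simulate(20000)
print("true effect      :", 2.0)
print("naive estimate   :", round(naive_slope(xs, ys), 3))
print("adjusted estimate:", round(adjusted_slope(xs, ys, zs), 3))
```

With these assumed coefficients the naive slope is inflated by the confounder, while the adjusted coefficient stays close to the true effect of 2.0. When Z is latent, this adjustment is unavailable, which is the situation that motivates learning under causal insufficiency.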
Learning Bayesian networks from demographic and health survey data.
Child mortality from preventable diseases such as pneumonia and diarrhoea in
low and middle-income countries remains a serious global challenge. We combine
knowledge with available Demographic and Health Survey (DHS) data from India,
to construct Causal Bayesian Networks (CBNs) and investigate the factors
associated with childhood diarrhoea. We make use of freeware tools to learn the
graphical structure of the DHS data with score-based, constraint-based, and
hybrid structure learning algorithms. We investigate the effect of missing
values, sample size, and knowledge-based constraints on each of the structure
learning algorithms and assess their accuracy with multiple scoring functions.
Weaknesses in the survey methodology and data available, as well as the
variability in the CBNs generated by the different algorithms, mean that it is
not possible to learn a definitive CBN from data. However, knowledge-based
constraints are found to be useful in reducing the variation in the graphs
produced by the different algorithms, and produce graphs which are more
reflective of the likely influential relationships in the data. Furthermore,
valuable insights are gained into the performance and characteristics of the
structure learning algorithms. Two score-based algorithms in particular, TABU
and FGES, demonstrate many desirable qualities: a) with sufficient data, they
produce a graph which is similar to the reference graph, b) they are relatively
insensitive to missing values, and c) they behave well with knowledge-based
constraints. The results provide a basis for further investigation of the DHS
data and for a deeper understanding of the behaviour of the structure learning
algorithms when applied to real-world settings.
Approximate learning of high dimensional Bayesian network structures via pruning of Candidate Parent Sets.
Score-based algorithms that learn Bayesian Network (BN) structures provide
solutions ranging from different levels of approximate learning to exact
learning. Approximate solutions exist because exact learning is generally not
applicable to networks of moderate or higher complexity. In general,
approximate solutions tend to sacrifice accuracy for speed, where the aim is to
minimise the loss in accuracy and maximise the gain in speed. While some
approximate algorithms are optimised to handle thousands of variables, these
algorithms may still be unable to learn such high dimensional structures. Some
of the most efficient score-based algorithms cast the structure learning
problem as a combinatorial optimisation of candidate parent sets. This paper
explores a strategy towards pruning the size of candidate parent sets, aimed at
high dimensionality problems. The results illustrate how different levels of
pruning affect the learning speed relative to the loss in accuracy in terms of
model fitting, and show that aggressive pruning may be required to produce
approximate solutions for high complexity problems.
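
To make the idea of pruning candidate parent sets concrete, here is a small sketch under stated assumptions. It is not the strategy proposed in the paper. It shows two generic steps: a lossless rule that drops a parent set when one of its proper subsets scores at least as well, followed by a more aggressive cut that keeps only the k best sets. The score table is hypothetical; in practice the local scores would come from a decomposable scoring function.

```python
def prune_dominated(scores):
    """Drop parent sets that have a proper subset with an equal or better score.

    `scores` maps frozenset(parents) -> local score (higher is better).
    """
    kept = {}
    for parents, score in scores.items():
        dominated = any(
            other < parents and other_score >= score  # `<` is proper subset
            for other, other_score in scores.items()
        )
        if not dominated:
            kept[parents] = score
    return kept


def keep_top_k(scores, k):
    """Aggressive, lossy pruning: keep only the k highest-scoring parent sets."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


# Hypothetical local scores for the candidate parent sets of one node.
scores = {
    frozenset(): -120.0,
    frozenset({"A"}): -100.0,
    frozenset({"B"}): -110.0,
    frozenset({"A", "B"}): -101.0,  # worse than its subset {"A"}
    frozenset({"A", "C"}): -95.0,
}

lossless = prune_dominated(scores)
aggressive = keep_top_k(lossless, k=2)
print(len(scores), len(lossless), len(aggressive))
```

The first step never removes a parent set that an optimal DAG would need, because swapping in the better-scoring subset cannot create a cycle. The second step trades accuracy for speed, which is the trade-off the abstract evaluates.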
Simpson's Paradox and the implications for medical trials.
This paper describes Simpson's paradox, and explains its serious implications
for randomised controlled trials. In particular, we show that for any number of
variables we can simulate the result of a controlled trial which uniformly
points to one conclusion (such as 'drug is effective') for every possible
combination of the variable states, but when a previously unobserved
confounding variable is included, every possible combination of the variable
states points to the opposite conclusion ('drug is not effective'). In other
words, no matter how many variables are considered, and no matter how
'conclusive' the result, one cannot conclude the result is truly 'valid' since
there is theoretically an unobserved confounding variable that could completely
reverse the result.
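
The reversal can be seen in a small worked example. The sketch below does not reproduce the paper's simulation; it uses the widely cited kidney stone treatment figures as an illustration, where stone size plays the role of the confounding variable. The variable and function names are assumptions.

```python
# (successes, patients) for each treatment, split by stone size.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def success_rate(successes, patients):
    """Proportion of successful outcomes."""
    return successes / patients


def pooled_rate(strata):
    """Success rate after merging all strata, i.e. ignoring the confounder."""
    successes = sum(s for s, _ in strata.values())
    patients = sum(n for _, n in strata.values())
    return success_rate(successes, patients)


for size in ("small", "large"):
    rate_a = success_rate(*outcomes["A"][size])
    rate_b = success_rate(*outcomes["B"][size])
    print(f"{size} stones: A={rate_a:.3f}  B={rate_b:.3f}")

print(f"pooled: A={pooled_rate(outcomes['A']):.3f}  "
      f"B={pooled_rate(outcomes['B']):.3f}")
```

Treatment A has the higher success rate within each stone size, yet treatment B looks better once the two groups are pooled. The conclusion depends on whether the confounder is included, which is the concern the abstract raises about unobserved variables.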
Model of host-pathogen interaction dynamics links in vivo optical imaging and immune responses
Tracking disease progression in vivo is essential for the development of treatments against bacterial infection. Optical imaging has become a central tool for in vivo tracking of bacterial population development and therapeutic response. For a precise understanding of in vivo imaging results in terms of disease mechanisms derived from detailed postmortem observations, however, a link between the two is needed. Here, we develop a model that provides that link for the investigation of Citrobacter rodentium infection, a mouse model for enteropathogenic Escherichia coli (EPEC). We connect in vivo disease progression of C57BL/6 mice infected with bioluminescent bacteria, imaged using optical tomography and X-ray computed tomography, to postmortem measurements of colonic immune cell infiltration. We use the model to explore changes to both the host immune response and the bacteria and to evaluate the response to antibiotic treatment. The developed model serves as a novel tool for the identification and development of new therapeutic interventions.
Value of information analysis for interventional and counterfactual Bayesian networks in forensic medical sciences
Objectives: Inspired by real-world examples from the forensic medical sciences domain, we seek to determine whether a decision about an interventional action could be subject to amendments on the basis of some incomplete information within the model, and whether it would be worthwhile for the decision maker to seek further information prior to suggesting a decision
Risk assessment and risk management of violent reoffending among prisoners
“The final publication is available at Springer via http://dx.doi.org/10.1016/j.eswa.2015.05.025”