4 research outputs found
Quantifying the Re-identification Risk of Event Logs for Process Mining
Event logs recorded during the execution of business processes constitute a
valuable source of information. When process mining techniques are applied to
them, event logs may reveal the actual process execution and enable reasoning on
quantitative or qualitative process properties. However, event logs often
contain sensitive information that could be related to individual process
stakeholders through background information and cross-correlation. We therefore
argue that, when publishing event logs, the risk of such re-identification
attacks must be considered. In this paper, we show how to quantify the
re-identification risk with measures for the individual uniqueness in event
logs. We also report on a large-scale study that explored the individual
uniqueness in a collection of publicly available event logs. Our results
suggest that potentially up to all of the cases in an event log may be
re-identified, which highlights the importance of privacy-preserving techniques
in process mining.
Comment: Accepted to CAiSE-202
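The notion of individual uniqueness described above can be illustrated with a minimal sketch. It assumes a case counts as unique when no other case in the log shares its projected representation; the function name `case_uniqueness`, the toy log, and the choice of projections are hypothetical and are not taken from the paper.

```python
from collections import Counter

def case_uniqueness(log, projection):
    """Fraction of cases whose projected representation occurs exactly once."""
    keys = [projection(case) for case in log]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(keys) if keys else 0.0

# Hypothetical toy log: one case attribute plus the sequence of activities.
toy_log = [
    {"age": 34, "trace": ("register", "triage", "discharge")},
    {"age": 34, "trace": ("register", "triage", "discharge")},
    {"age": 51, "trace": ("register", "triage", "discharge")},
    {"age": 51, "trace": ("register", "surgery", "discharge")},
]

# Background knowledge limited to the trace, then to trace plus attribute.
by_trace = case_uniqueness(toy_log, lambda c: c["trace"])
by_trace_and_age = case_uniqueness(toy_log, lambda c: (c["age"], c["trace"]))
```

In this toy log, adding the case attribute to the assumed background knowledge raises the share of unique cases, which is the effect a uniqueness measure is meant to expose.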
On the Uncertain Single-View Depths in Colonoscopies
Estimating depth information from endoscopic images is a prerequisite for a
wide set of AI-assisted technologies, such as accurate localization and
measurement of tumors, or identification of non-inspected areas. As the domain
specificity of colonoscopies -- deformable low-texture environments with
fluids, poor lighting conditions and abrupt sensor motions -- poses challenges
to multi-view 3D reconstructions, single-view depth learning stands out as a
promising line of research. Depth learning can be extended in a Bayesian
setting, which enables continual learning, improves decision making and can be
used to compute confidence intervals or quantify uncertainty for in-body
measurements. In this paper, we explore for the first time Bayesian deep
networks for single-view depth estimation in colonoscopies. Our specific
contribution is two-fold: 1) an exhaustive analysis of scalable Bayesian
networks for depth learning in different datasets, highlighting challenges and
conclusions regarding synthetic-to-real domain changes and supervised vs.
self-supervised methods; and 2) a novel teacher-student approach to deep depth
learning that takes into account the teacher uncertainty.
Comment: 11 page
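One way to picture a teacher-student scheme that accounts for teacher uncertainty is sketched below. It assumes the teacher is an ensemble whose per-pixel variance serves as the uncertainty estimate, and that the student loss down-weights pixels where the teacher is uncertain. Both function names and the weighting rule are illustrative assumptions, not the loss used in the paper.

```python
import math

def ensemble_mean_and_variance(predictions):
    """Per-pixel mean and variance across ensemble members.

    predictions: list of equal-length depth lists, one list per member.
    """
    n = len(predictions)
    per_pixel = list(zip(*predictions))
    means = [sum(p) / n for p in per_pixel]
    variances = [sum((d - m) ** 2 for d in p) / n
                 for p, m in zip(per_pixel, means)]
    return means, variances

def uncertainty_weighted_l1(student, teacher_mean, teacher_var, eps=1e-6):
    """Mean absolute error, scaled per pixel by 1 / (teacher std + eps).

    Assumed weighting: confident teacher pixels (small variance) count more.
    """
    total = 0.0
    for s, m, v in zip(student, teacher_mean, teacher_var):
        total += abs(s - m) / (math.sqrt(v) + eps)
    return total / len(student)

# Hypothetical two-member ensemble over a two-pixel image.
teacher_mean, teacher_var = ensemble_mean_and_variance([[1.0, 2.0], [3.0, 2.0]])
loss = uncertainty_weighted_l1([3.0, 2.0], teacher_mean, teacher_var)
```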
Sample-Specific Root Causal Inference with Latent Variables
Root causal analysis seeks to identify the set of initial perturbations that
induce an unwanted outcome. In prior work, we defined sample-specific root
causes of disease using exogenous error terms that predict a diagnosis in a
structural equation model. We rigorously quantified predictivity using Shapley
values. However, the associated algorithms for inferring root causes assume no
latent confounding. We relax this assumption by permitting confounding among
the predictors. We then introduce a corresponding procedure called Extract
Errors with Latents (EEL) for recovering the error terms up to contamination by
vertices on certain paths under the linear non-Gaussian acyclic model. EEL also
identifies the smallest sets of dependent errors for fast computation of the
Shapley values. The algorithm bypasses the hard problem of estimating the
underlying causal graph in both cases. Experiments highlight the superior
accuracy and robustness of EEL relative to its predecessors.
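The use of Shapley values to quantify how much each error term contributes to a prediction can be illustrated with a generic exact computation. This is a minimal sketch of the standard Shapley formula over a small player set, not the EEL algorithm; the value function `toy_value` and the error-term names are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values; value maps a frozenset of players to a number."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of coalitions of size k that exclude player p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

def toy_value(coalition):
    """Hypothetical predictivity: e1 alone adds 2, e1 and e2 together add 1 more."""
    v = 0.0
    if "e1" in coalition:
        v += 2.0
    if "e1" in coalition and "e2" in coalition:
        v += 1.0
    return v

phi = shapley_values(["e1", "e2", "e3"], toy_value)
```

The exact computation enumerates every coalition, so its cost grows exponentially with the number of players; this is why restricting attention to small sets of dependent errors, as the abstract describes, matters for speed.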
Analysing the Impact of Changes in User Interface of e-Health Record Systems on Clinical Pathways using Process Mining
The provision of care in a hospital includes a series of activities that are often recorded in electronic health record (EHR) systems. Analysing the data in these EHRs has the potential to support the understanding of care processes and the exploration of opportunities for process improvement.
One of the emerging data analytics approaches for such analyses is process mining, and one critical challenge in working with EHR data is that processes might change over time. This thesis uses a process mining approach to detect process change over time and analyse the impact of those changes on the EHR data. The overall aim is to summarise the attributable change in the data due to the process so that clinicians can better analyse the data.
Three datasets were used in this study to understand the variability of the EHR systems. The first dataset is a publicly available EHR dataset that was used for developing the methods and supporting the reproducibility of the research. The second dataset is a de-identified subset of the database of cancer patients from the Leeds Cancer Centre; it was used in experiments to improve on the results of a previous study using the same dataset. The third dataset was the full Leeds Cancer Centre EHR database, used after more comprehensive ethics approval was granted. On the third dataset, experiments were done to analyse the impact of a known system change on clinical pathways and to explore process change over time without a known system change. All three datasets were analysed using process mining.
Process mining was shown to be useful for analysing clinical pathways and exploring process changes over time. It can be used to visualise the process before and after a known change. When the system change is unknown, process mining can be used to explore the process execution over time and identify the potential period in which the system was changed. This thesis explores some aspects of the complex interrelatedness of the process and the user interface (UI) of the EHR system.
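Comparing the process before and after a known change can be sketched by contrasting directly-follows relations in the two periods. This is a minimal illustration under the assumption that each pathway is a sequence of activity names already split at the change date; the function names and the toy pathways are hypothetical and do not come from the thesis.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts

def relation_changes(before, after):
    """Relative-frequency difference (after minus before) per activity pair."""
    df_before, df_after = directly_follows(before), directly_follows(after)
    n_before = sum(df_before.values()) or 1
    n_after = sum(df_after.values()) or 1
    pairs = set(df_before) | set(df_after)
    return {pair: df_after[pair] / n_after - df_before[pair] / n_before
            for pair in pairs}

# Hypothetical pathways recorded before and after a system change.
before = [("referral", "consult", "treatment"),
          ("referral", "consult", "treatment")]
after = [("referral", "treatment"),
         ("referral", "consult", "treatment")]

changes = relation_changes(before, after)
```

A positive value marks a transition that became relatively more frequent after the change. When the change date is unknown, the same comparison could be repeated over sliding time windows to look for the period where the relations shift.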