Fusing Continuous-valued Medical Labels using a Bayesian Model
With the rapid increase in volume of time series medical data available
through wearable devices, there is a need to employ automated algorithms to
label data. Examples of labels include interventions, changes in activity (e.g.
sleep) and changes in physiology (e.g. arrhythmias). However, automated
algorithms tend to be unreliable, resulting in lower-quality care. Expert
annotations are scarce, expensive, and prone to significant inter- and
intra-observer variance. To address these problems, a Bayesian
Continuous-valued Label Aggregator (BCLA) is proposed to provide a reliable
estimate of the aggregated label while accurately inferring the precision
and bias of each algorithm. The BCLA was applied to QT interval (a pro-arrhythmic
indicator) estimation from the electrocardiogram using labels from the 2006
PhysioNet/Computing in Cardiology Challenge database. It was compared to the
mean, median, and a previously proposed Expectation Maximization (EM) label
aggregation approaches. While accurately predicting each labelling algorithm's
bias and precision, the BCLA achieved a root-mean-square error of
11.78±0.63 ms, significantly outperforming the best Challenge entry
(15.37±2.13 ms) as well as the EM, mean, and median voting strategies
(14.76±0.52 ms, 17.61±0.55 ms, and 14.43±0.57 ms, respectively).
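The fusion idea can be sketched as follows: iteratively estimate each algorithm's bias and precision from its residuals against the current consensus, then re-fuse the labels with bias-corrected, precision-weighted averaging. This is a simplified EM-style sketch, not the paper's exact Bayesian model; `aggregate_labels` and all parameters are illustrative:

```python
import numpy as np

def aggregate_labels(Y, n_iter=20):
    """Fuse continuous labels from K algorithms over N records by
    iteratively estimating each algorithm's bias and precision.
    A simplified EM-style scheme in the spirit of the BCLA, not the
    paper's exact Bayesian model. Y is an (N, K) array of labels."""
    fused = Y.mean(axis=1)                          # start from the mean vote
    for _ in range(n_iter):
        resid = Y - fused[:, None]                  # residuals against consensus
        bias = resid.mean(axis=0)                   # systematic offset per algorithm
        prec = 1.0 / (resid.var(axis=0) + 1e-9)     # precision = inverse residual variance
        # bias-corrected, precision-weighted fusion
        fused = ((Y - bias) * prec).sum(axis=1) / prec.sum()
    return fused, bias, prec
```

An accurate but biased algorithm is down-weighted only for its offset, while a noisy algorithm is down-weighted through its low precision, which is how such schemes can beat the plain mean or median vote.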
Unleashing the power of federated learning in fragmented digital healthcare systems: a visionary perspective
The digital healthcare landscape, including infrastructure, governance, interoperability, and user adoption, is continuously evolving, with some systems taking a more centralised approach and others exhibiting a higher degree of fragmentation. Attitudes towards centralised healthcare systems in affluent countries are primarily shaped by historical development, infrastructure investment, and regulatory frameworks, which offer advantages in standardised practices, centralised decision-making, and economies of scale. In contrast, complexities arising from diverse stakeholders, interoperability challenges, and privacy and security concerns often hinder a completely centralised healthcare system even in high-income countries such as the United Kingdom, or in federal systems such as the United States. Moreover, decentralised healthcare systems are more prevalent in resource-poor countries. This paper presents our viewpoint and perspectives on the potential of federated learning in decentralised healthcare systems, especially in countries with infrastructure constraints, and discusses its advantages, privacy and security concerns, and challenges. As data-hungry artificial-intelligence-enabled systems gradually reshape the healthcare ecosystem, federated learning offers an opportunity to distribute the machine-learning training process across multiple decentralised edge devices with reduced data transfer. A decentralised digital healthcare system can therefore leverage collaborative model training while protecting highly sensitive and personal health information. However, challenges related to data heterogeneity, communication latency, and model aggregation must be addressed for such systems to succeed. Adapting the federated learning framework to the specific needs and constraints of low- and middle-income countries is crucial to unlocking its potential to improve healthcare outcomes.
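The core federated step described above — collaborative model training without raw data transfer — can be sketched as the server-side aggregation of Federated Averaging, where each client trains locally and only model weights are combined. The dataset-size weighting below is the standard FedAvg choice, used here as an illustrative assumption rather than a prescription of this paper:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """One round of server-side Federated Averaging: combine locally
    trained model weights, weighting each client by its local dataset
    size, so raw patient data never leaves the device."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))
```

In a full system this aggregation alternates with local training rounds on each edge device; the data-heterogeneity and communication-latency challenges mentioned above arise precisely in that loop.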
One-class classification of point patterns of extremes
Novelty detection or one-class classification starts from a model describing some type of 'normal behaviour' and aims to classify deviations from this model as being either novelties or anomalies.
In this paper, the problem of novelty detection is treated for point patterns S = {X_1, ..., X_k} ⊂ R^d, where examples of anomalies are very sparse or even absent. The latter complicates the tuning of hyperparameters in models commonly used for novelty detection, such as one-class support vector machines and hidden Markov models.
To this end, the use of extreme value statistics is introduced to estimate explicitly a model for the abnormal class by means of extrapolation from a statistical model for the normal class. We show how multiple types of information obtained from any available extreme instances of S can be combined to reduce the high false-alarm rate that is typically encountered when classes are strongly imbalanced, as often occurs in the one-class setting (whereby 'abnormal' data are often scarce).
The approach is illustrated using simulated data, and then a real-life application is used as an exemplar, whereby accelerometry data from epileptic seizures are analysed; these are known to be extreme and rare with respect to normal accelerometer data.
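The extrapolation step can be illustrated with a minimal peaks-over-threshold sketch: fit a tail model to the largest 'normal' novelty scores and extrapolate an alarm threshold beyond the observed data. An exponential tail is used here for simplicity (the paper's extreme-value model is richer); `evt_threshold` and its parameters are illustrative:

```python
import numpy as np

def evt_threshold(scores, u_quantile=0.9, p_alarm=1e-3):
    """Peaks-over-threshold sketch for one-class novelty detection:
    model exceedances of normal-class novelty scores above a high
    quantile u with an exponential tail, then extrapolate to the
    score whose exceedance probability under 'normal' is p_alarm."""
    u = np.quantile(scores, u_quantile)
    excess = scores[scores > u] - u
    beta = excess.mean()              # MLE scale of the exponential tail
    p_u = (scores > u).mean()         # empirical P(score > u)
    # solve p_u * exp(-(t - u) / beta) = p_alarm for the threshold t
    return u + beta * np.log(p_u / p_alarm)
```

Because the threshold is extrapolated from the normal class alone, no abnormal examples are needed to set it, which is exactly the regime the abstract describes.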
Modelling physiological deterioration in post-operative patient vital-sign data
Patients who undergo upper-gastrointestinal surgery have a high incidence of post-operative complications, often requiring admission to the intensive care unit several days after surgery. A dataset comprising observational vital-sign data from 171 post-operative patients taking part in a two-phase clinical trial at the Oxford Cancer Centre was used to explore the trajectory of patients’ vital-sign changes during their stay on the post-operative ward, using both univariate and multivariate analyses. A model of normality based on vital-sign data from patients who had a “normal” recovery was constructed using a kernel density estimate, and tested with “abnormal” data from patients who deteriorated sufficiently to be re-admitted to the intensive care unit. The vital-sign distributions from “normal” patients were found to vary over time from admission to the post-operative ward until their discharge home, but no significant changes in their distributions were observed from halfway through their stay on the ward to the time of discharge. The model of normality identified patient deterioration when tested with unseen “abnormal” data, suggesting that such techniques may be used to provide early warning of adverse physiological events.
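A kernel density model of normality of this kind can be sketched in a few lines; the 1-D Gaussian-kernel version below is a simplified illustration (the study's model is multivariate, and `bandwidth` here is an assumed constant), where low estimated density on a new observation flags possible deterioration:

```python
import numpy as np

def kde_log_density(train, x, bandwidth=0.5):
    """Gaussian kernel density estimate fitted to 'normal' vital-sign
    data; returns the log-density of new observations x. Minimal 1-D
    sketch, not the study's multivariate model."""
    diff = (x[:, None] - train[None, :]) / bandwidth
    k = np.exp(-0.5 * diff ** 2) / np.sqrt(2 * np.pi)   # Gaussian kernels
    return np.log(k.mean(axis=1) / bandwidth + 1e-300)  # average kernel mass
```

Thresholding this log-density on unseen data is the novelty test: observations falling in regions the "normal" recoveries never visited score far below those near the bulk of the training data.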
SoCal: Selective Oracle Questioning for Consistency-based Active Learning of Cardiac Signals
The ubiquity and rate of collection of cardiac signals produce large,
unlabelled datasets. Active learning (AL) can exploit such datasets by
incorporating human annotators (oracles) to improve generalization performance.
However, the over-reliance of existing algorithms on oracles continues to
burden physicians. To minimize this burden, we propose SoCal, a
consistency-based AL framework that dynamically determines whether to request a
label from an oracle or to generate a pseudo-label instead. We show that our
framework decreases the labelling burden while maintaining strong performance,
even in the presence of a noisy oracle.
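The dynamic choice between querying the oracle and generating a pseudo-label can be sketched as a consistency test: if several perturbed views of an instance agree on a confident prediction, pseudo-label it; otherwise defer to the human. This is an illustrative rule in the spirit of the framework, not SoCal's exact criterion; `tau` and the agreement test are assumptions:

```python
import numpy as np

def label_or_query(probs_per_view, tau=0.8):
    """Consistency-based labelling decision: probs_per_view is a
    (views, classes) array of class probabilities from perturbed views
    of one instance. Returns a pseudo-label if the views agree on one
    confident class, or None to defer the instance to the oracle."""
    mean_p = probs_per_view.mean(axis=0)                    # average over views
    agree = len(set(np.argmax(probs_per_view, axis=1))) == 1
    if agree and mean_p.max() >= tau:
        return int(np.argmax(mean_p))                       # pseudo-label
    return None                                             # ask the oracle
```

Only the instances that fail the test reach the physician, which is how such a rule reduces the labelling burden while containing the risk of propagating wrong pseudo-labels.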
CLOCS: Contrastive Learning of Cardiac Signals Across Space, Time, and Patients
The healthcare industry generates troves of unlabelled physiological data.
This data can be exploited via contrastive learning, a self-supervised
pre-training method that encourages representations of instances to be similar
to one another. We propose a family of contrastive learning methods, CLOCS,
that encourages representations across space, time, and patients to be
similar to one another. We show that CLOCS consistently outperforms the
state-of-the-art methods, BYOL and SimCLR, when performing a linear evaluation
of, and fine-tuning on, downstream tasks. We also show that CLOCS achieves
strong generalization performance with only 25% of labelled training data.
Furthermore, our training procedure naturally generates patient-specific
representations that can be used to quantify patient similarity.
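The contrastive objective can be illustrated with a generic InfoNCE-style loss, where two views of the same segment or patient form a positive pair and all other rows in the batch act as negatives (a sketch, not CLOCS's exact objective; `temp` is an assumed temperature):

```python
import numpy as np

def contrastive_loss(z1, z2, temp=0.1):
    """InfoNCE-style loss on two batches of embeddings: z1[i] and z2[i]
    are two views of the same instance (positives); every other pairing
    in the batch serves as a negative."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)   # L2-normalise
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temp                                # cosine similarities
    sim = sim - sim.max(axis=1, keepdims=True)            # numerical stability
    log_p = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_p))   # maximise probability of matched pairs
```

Minimising this loss pulls matched views together and pushes mismatched ones apart; extending the notion of "positive pair" across space (leads), time (segments), and patients is the paper's contribution.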
Exploring the Effectiveness of Instruction Tuning in Biomedical Language Processing
Large Language Models (LLMs), particularly those similar to ChatGPT, have
significantly influenced the field of Natural Language Processing (NLP). While
these models excel in general language tasks, their performance in
domain-specific downstream tasks such as biomedical and clinical Named Entity
Recognition (NER), Relation Extraction (RE), and Medical Natural Language
Inference (NLI) is still evolving. In this context, our study investigates the
potential of instruction tuning for biomedical language processing, applying
this technique to two general LLMs of substantial scale. We present a
comprehensive, instruction-based model trained on a dataset that consists of
approximately instruction-focused samples. This dataset represents a
carefully curated compilation of existing data, meticulously adapted and
reformatted to align with the specific requirements of our instruction-based
tasks. This initiative represents an important step in utilising such models to
achieve results on par with specialised encoder-only models like BioBERT and
BioClinicalBERT for various classical biomedical NLP tasks. Our work includes
an analysis of the dataset's composition and its impact on model performance,
providing insights into the intricacies of instruction tuning. By sharing our
codes, models, and the distinctively assembled instruction-based dataset, we
seek to encourage ongoing research and development in this area.
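The reformatting step — adapting existing biomedical datasets into instruction-based samples — can be sketched with a generic prompt template; the template and field names below are illustrative assumptions, not the paper's actual format:

```python
def to_instruction(task, text, answer):
    """Reformat a classical biomedical NLP example (e.g. NER, RE, NLI)
    into an instruction-tuning sample: a natural-language task
    description, the input text, and the expected completion."""
    prompt = (f"### Instruction:\n{task}\n\n"
              f"### Input:\n{text}\n\n"
              f"### Response:")
    return {"prompt": prompt, "completion": " " + answer}
```

The same template can wrap NER ("List all disease mentions"), RE ("State the relation between the marked entities"), and NLI ("Does the premise entail the hypothesis?") examples, which is what turns a compilation of classical datasets into one instruction-tuning corpus.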