Fusing Continuous-valued Medical Labels using a Bayesian Model
With the rapid increase in volume of time series medical data available
through wearable devices, there is a need to employ automated algorithms to
label data. Examples of labels include interventions, changes in activity (e.g.
sleep) and changes in physiology (e.g. arrhythmias). However, automated
algorithms tend to be unreliable, resulting in lower-quality care. Expert
annotations are scarce, expensive, and prone to significant inter- and
intra-observer variance. To address these problems, a Bayesian
Continuous-valued Label Aggregator (BCLA) is proposed to provide a reliable
estimate of the aggregated label while accurately inferring the precision
and bias of each algorithm. The BCLA was applied to QT interval (a pro-arrhythmic
indicator) estimation from the electrocardiogram using labels from the 2006
PhysioNet/Computing in Cardiology Challenge database. It was compared to the
mean, median, and a previously proposed Expectation Maximization (EM) label
aggregation approaches. While accurately predicting each labelling algorithm's
bias and precision, the BCLA achieved a root-mean-square error of
11.78±0.63 ms, significantly outperforming the best Challenge entry
(15.37±2.13 ms) as well as the EM, mean, and median voting strategies
(14.76±0.52 ms, 17.61±0.55 ms, and 14.43±0.57 ms, respectively).
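The fusion idea can be sketched as follows: iteratively estimate each algorithm's bias and precision from its residuals against the current consensus, then re-fuse the labels with bias-corrected, precision-weighted averaging. This is a simplified EM-style sketch, not the paper's exact Bayesian model; `aggregate_labels` and all parameters are illustrative:

```python
import numpy as np

def aggregate_labels(Y, n_iter=20):
    """Fuse continuous labels from K algorithms over N records by
    iteratively estimating each algorithm's bias and precision.
    A simplified EM-style scheme in the spirit of the BCLA, not the
    paper's exact Bayesian model. Y is an (N, K) array of labels."""
    fused = Y.mean(axis=1)                          # start from the mean vote
    for _ in range(n_iter):
        resid = Y - fused[:, None]                  # residuals against consensus
        bias = resid.mean(axis=0)                   # systematic offset per algorithm
        prec = 1.0 / (resid.var(axis=0) + 1e-9)     # precision = inverse residual variance
        # bias-corrected, precision-weighted fusion
        fused = ((Y - bias) * prec).sum(axis=1) / prec.sum()
    return fused, bias, prec
```

An accurate but biased algorithm is down-weighted only for its offset, while a noisy algorithm is down-weighted through its low precision, which is how such schemes can beat the plain mean or median vote.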
Unleashing the power of federated learning in fragmented digital healthcare systems: a visionary perspective
The digital healthcare landscape, including infrastructure, governance, interoperability, and user adoption, is continuously evolving, with some systems taking a more centralised approach and others exhibiting a higher degree of fragmentation. Attitudes towards centralised healthcare systems in affluent countries are primarily shaped by historical development, infrastructure investment, and regulatory frameworks, which offer advantages in standardised practices, centralised decision-making, and economies of scale. In contrast, complexities arising from diverse stakeholders, interoperability challenges, and privacy and security concerns often hinder a completely centralised healthcare system even in high-income countries such as the United Kingdom, or in federal systems such as the United States. Moreover, decentralised healthcare systems are more prevalent in resource-poor countries. This paper presents our viewpoint and perspectives on the potential of federated learning in decentralised healthcare systems, especially in countries with infrastructure constraints, and discusses its advantages, privacy and security concerns, and challenges. As data-hungry artificial-intelligence-enabled systems gradually reshape the healthcare ecosystem, federated learning offers an opportunity to distribute the machine-learning training process across multiple decentralised edge devices with reduced data transfer. A decentralised digital healthcare system can therefore leverage collaborative model training while protecting highly sensitive and personal health information. However, challenges related to data heterogeneity, communication latency, and model aggregation must be addressed for such systems to succeed. Adapting the federated learning framework to the specific needs and constraints of low- and middle-income countries is crucial to unlocking its potential to improve healthcare outcomes.
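The core federated step described above — collaborative model training without raw data transfer — can be sketched as the server-side aggregation of Federated Averaging, where each client trains locally and only model weights are combined. The dataset-size weighting below is the standard FedAvg choice, used here as an illustrative assumption rather than a prescription of this paper:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """One round of server-side Federated Averaging: combine locally
    trained model weights, weighting each client by its local dataset
    size, so raw patient data never leaves the device."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))
```

In a full system this aggregation alternates with local training rounds on each edge device; the data-heterogeneity and communication-latency challenges mentioned above arise precisely in that loop.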
One-class classification of point patterns of extremes
Novelty detection or one-class classification starts from a model describing some type of 'normal behaviour' and aims to classify deviations from this model as being either novelties or anomalies.
In this paper, the problem of novelty detection is treated for point patterns S = {X_1, ..., X_k} ⊂ R^d, where examples of anomalies are very sparse or even absent. The latter complicates the tuning of hyperparameters in models commonly used for novelty detection, such as one-class support vector machines and hidden Markov models.
To this end, the use of extreme value statistics is introduced to estimate explicitly a model for the abnormal class by means of extrapolation from a statistical model for the normal class. We show how multiple types of information obtained from any available extreme instances of S can be combined to reduce the high false-alarm rate that is typically encountered when classes are strongly imbalanced, as often occurs in the one-class setting (whereby 'abnormal' data are often scarce).
The approach is illustrated using simulated data, and then a real-life application is used as an exemplar, whereby accelerometry data from epileptic seizures are analysed; these are known to be extreme and rare with respect to normal accelerometer data.
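The extrapolation step can be illustrated with a minimal peaks-over-threshold sketch: fit a tail model to the largest 'normal' novelty scores and extrapolate an alarm threshold beyond the observed data. An exponential tail is used here for simplicity (the paper's extreme-value model is richer); `evt_threshold` and its parameters are illustrative:

```python
import numpy as np

def evt_threshold(scores, u_quantile=0.9, p_alarm=1e-3):
    """Peaks-over-threshold sketch for one-class novelty detection:
    model exceedances of normal-class novelty scores above a high
    quantile u with an exponential tail, then extrapolate to the
    score whose exceedance probability under 'normal' is p_alarm."""
    u = np.quantile(scores, u_quantile)
    excess = scores[scores > u] - u
    beta = excess.mean()              # MLE scale of the exponential tail
    p_u = (scores > u).mean()         # empirical P(score > u)
    # solve p_u * exp(-(t - u) / beta) = p_alarm for the threshold t
    return u + beta * np.log(p_u / p_alarm)
```

Because the threshold is extrapolated from the normal class alone, no abnormal examples are needed to set it, which is exactly the regime the abstract describes.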
Modelling physiological deterioration in post-operative patient vital-sign data
Patients who undergo upper-gastrointestinal surgery have a high incidence of post-operative complications, often requiring admission to the intensive care unit several days after surgery. A dataset comprising observational vital-sign data from 171 post-operative patients taking part in a two-phase clinical trial at the Oxford Cancer Centre was used to explore the trajectory of patients’ vital-sign changes during their stay on the post-operative ward, using both univariate and multivariate analyses. A model of normality based on vital-sign data from patients who had a “normal” recovery was constructed using a kernel density estimate, and tested with “abnormal” data from patients who deteriorated sufficiently to be re-admitted to the intensive care unit. The vital-sign distributions from “normal” patients were found to vary over time from admission to the post-operative ward until their discharge home, but no significant changes in their distributions were observed from halfway through their stay on the ward to the time of discharge. The model of normality identified patient deterioration when tested with unseen “abnormal” data, suggesting that such techniques may be used to provide early warning of adverse physiological events.
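A kernel density model of normality of this kind can be sketched in a few lines; the 1-D Gaussian-kernel version below is a simplified illustration (the study's model is multivariate, and `bandwidth` here is an assumed constant), where low estimated density on a new observation flags possible deterioration:

```python
import numpy as np

def kde_log_density(train, x, bandwidth=0.5):
    """Gaussian kernel density estimate fitted to 'normal' vital-sign
    data; returns the log-density of new observations x. Minimal 1-D
    sketch, not the study's multivariate model."""
    diff = (x[:, None] - train[None, :]) / bandwidth
    k = np.exp(-0.5 * diff ** 2) / np.sqrt(2 * np.pi)   # Gaussian kernels
    return np.log(k.mean(axis=1) / bandwidth + 1e-300)  # average kernel mass
```

Thresholding this log-density on unseen data is the novelty test: observations falling in regions the "normal" recoveries never visited score far below those near the bulk of the training data.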
SoCal: Selective Oracle Questioning for Consistency-based Active Learning of Cardiac Signals
The ubiquity and rate of collection of cardiac signals produce large,
unlabelled datasets. Active learning (AL) can exploit such datasets by
incorporating human annotators (oracles) to improve generalization performance.
However, the over-reliance of existing algorithms on oracles continues to
burden physicians. To minimize this burden, we propose SoCal, a
consistency-based AL framework that dynamically determines whether to request a
label from an oracle or to generate a pseudo-label instead. We show that our
framework decreases the labelling burden while maintaining strong performance,
even in the presence of a noisy oracle.
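The dynamic choice between querying the oracle and generating a pseudo-label can be sketched as a consistency test: if several perturbed views of an instance agree on a confident prediction, pseudo-label it; otherwise defer to the human. This is an illustrative rule in the spirit of the framework, not SoCal's exact criterion; `tau` and the agreement test are assumptions:

```python
import numpy as np

def label_or_query(probs_per_view, tau=0.8):
    """Consistency-based labelling decision: probs_per_view is a
    (views, classes) array of class probabilities from perturbed views
    of one instance. Returns a pseudo-label if the views agree on one
    confident class, or None to defer the instance to the oracle."""
    mean_p = probs_per_view.mean(axis=0)                    # average over views
    agree = len(set(np.argmax(probs_per_view, axis=1))) == 1
    if agree and mean_p.max() >= tau:
        return int(np.argmax(mean_p))                       # pseudo-label
    return None                                             # ask the oracle
```

Only the instances that fail the test reach the physician, which is how such a rule reduces the labelling burden while containing the risk of propagating wrong pseudo-labels.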
CLOCS: Contrastive Learning of Cardiac Signals Across Space, Time, and Patients
The healthcare industry generates troves of unlabelled physiological data.
This data can be exploited via contrastive learning, a self-supervised
pre-training method that encourages representations of instances to be similar
to one another. We propose a family of contrastive learning methods, CLOCS,
that encourages representations across space, time, and patients to be
similar to one another. We show that CLOCS consistently outperforms the
state-of-the-art methods, BYOL and SimCLR, when performing a linear evaluation
of, and fine-tuning on, downstream tasks. We also show that CLOCS achieves
strong generalization performance with only 25% of labelled training data.
Furthermore, our training procedure naturally generates patient-specific
representations that can be used to quantify patient similarity.
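The contrastive objective can be illustrated with a generic InfoNCE-style loss, where two views of the same segment or patient form a positive pair and all other rows in the batch act as negatives (a sketch, not CLOCS's exact objective; `temp` is an assumed temperature):

```python
import numpy as np

def contrastive_loss(z1, z2, temp=0.1):
    """InfoNCE-style loss on two batches of embeddings: z1[i] and z2[i]
    are two views of the same instance (positives); every other pairing
    in the batch serves as a negative."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)   # L2-normalise
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temp                                # cosine similarities
    sim = sim - sim.max(axis=1, keepdims=True)            # numerical stability
    log_p = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_p))   # maximise probability of matched pairs
```

Minimising this loss pulls matched views together and pushes mismatched ones apart; extending the notion of "positive pair" across space (leads), time (segments), and patients is the paper's contribution.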
Exploring the Effectiveness of Instruction Tuning in Biomedical Language Processing
Large Language Models (LLMs), particularly those similar to ChatGPT, have
significantly influenced the field of Natural Language Processing (NLP). While
these models excel in general language tasks, their performance in
domain-specific downstream tasks such as biomedical and clinical Named Entity
Recognition (NER), Relation Extraction (RE), and Medical Natural Language
Inference (NLI) is still evolving. In this context, our study investigates the
potential of instruction tuning for biomedical language processing, applying
this technique to two general LLMs of substantial scale. We present a
comprehensive, instruction-based model trained on a dataset that consists of
approximately instruction-focused samples. This dataset represents a
carefully curated compilation of existing data, meticulously adapted and
reformatted to align with the specific requirements of our instruction-based
tasks. This initiative represents an important step in utilising such models to
achieve results on par with specialised encoder-only models like BioBERT and
BioClinicalBERT for various classical biomedical NLP tasks. Our work includes
an analysis of the dataset's composition and its impact on model performance,
providing insights into the intricacies of instruction tuning. By sharing our
codes, models, and the distinctively assembled instruction-based dataset, we
seek to encourage ongoing research and development in this area.
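The reformatting step — adapting existing biomedical datasets into instruction-based samples — can be sketched with a generic prompt template; the template and field names below are illustrative assumptions, not the paper's actual format:

```python
def to_instruction(task, text, answer):
    """Reformat a classical biomedical NLP example (e.g. NER, RE, NLI)
    into an instruction-tuning sample: a natural-language task
    description, the input text, and the expected completion."""
    prompt = (f"### Instruction:\n{task}\n\n"
              f"### Input:\n{text}\n\n"
              f"### Response:")
    return {"prompt": prompt, "completion": " " + answer}
```

The same template can wrap NER ("List all disease mentions"), RE ("State the relation between the marked entities"), and NLI ("Does the premise entail the hypothesis?") examples, which is what turns a compilation of classical datasets into one instruction-tuning corpus.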